Commit message
coro.destroy.
This is the fourth patch in the coroutine series. The CoroEarly pass now lowers the
coro.resume and coro.destroy intrinsics by replacing them with an indirect call to an
address returned by the coro.subfn.addr intrinsic. This is done so that CGPassManager
recognizes devirtualization when CoroElide later replaces a call to coro.subfn.addr
with an appropriate function address.
Patch by Gor Nishanov!
Differential Revision: https://reviews.llvm.org/D22998
llvm-svn: 277765
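A rough, self-contained C++ analogy of the devirtualization this sets up; the frame
layout and names below are illustrative stand-ins, not LLVM's actual coroutine
representation.

    #include <cstdio>

    struct CoroFrame {
      void (*ResumeFn)(CoroFrame *); // slot a coro.subfn.addr-style lookup would load
      int State;
    };

    static void resumeImpl(CoroFrame *F) { std::printf("state=%d\n", F->State++); }

    // What the lowering emits, conceptually: an indirect call through the frame slot.
    static void resumeIndirect(CoroFrame *F) { F->ResumeFn(F); }

    // What a later pass can produce once the callee is known: a direct call.
    static void resumeDirect(CoroFrame *F) { resumeImpl(F); }

    int main() {
      CoroFrame F{resumeImpl, 0};
      resumeIndirect(&F); // devirtualizable when the target is provably resumeImpl
      resumeDirect(&F);
      return 0;
    }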
constant vectors
llvm-svn: 277762
Adding missing tests for OCL type names for half, float, double, char, short, long, and unknown.
Patch by Aaron En Ye Shi.
Differential Revision: https://reviews.llvm.org/D22964
llvm-svn: 277759
This eliminates the remnants of std::error_code from the
DebugInfo libraries.
llvm-svn: 277758
It leads to a crash when they're not. I'm *sure* I've made this mistake before,
at least once.
llvm-svn: 277755
constant vectors
llvm-svn: 277752
llvm-svn: 277749
llvm-svn: 277747
Previously, FastISel for WebAssembly wasn't checking the return value of
`getRegForValue` in certain cases, which would generate instructions
referencing NoReg. This patch fixes this behavior.
Patch by Dominic Chen
Differential Revision: https://reviews.llvm.org/D23100
llvm-svn: 277742
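A self-contained sketch of the missing-check pattern; the register map,
getRegForValueSketch, and the fallback logic are invented for illustration, and only
the convention that 0 means "no register" is taken from FastISel.

    #include <cstdio>
    #include <map>
    #include <string>

    // Hypothetical stand-in for FastISel::getRegForValue: 0 means "no register".
    static std::map<std::string, unsigned> RegMap = {{"x", 1}, {"y", 2}};
    unsigned getRegForValueSketch(const std::string &Name) {
      auto It = RegMap.find(Name);
      return It == RegMap.end() ? 0 : It->second;
    }

    bool selectOperand(const std::string &Name, unsigned &RegOut) {
      unsigned Reg = getRegForValueSketch(Name);
      if (Reg == 0)   // the previously missing check: don't reference "no register"
        return false; // caller falls back to the slower selection path
      RegOut = Reg;
      return true;
    }

    int main() {
      unsigned R = 0;
      std::printf("x ok=%d\n", selectOperand("x", R));
      std::printf("z ok=%d\n", selectOperand("z", R));
      return 0;
    }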
llvm-svn: 277740
constant vectors
I'm also removing a misplaced pair of more specific folds from InstCombine in this patch,
so that it is clear those folds now happen in InstSimplify.
llvm-svn: 277738
In preparation for adding a binary permute matching function.
llvm-svn: 277737
adjustments.
Summary:
TargetBaseAlign is no longer required, since LSV checks whether the target allows
misaligned accesses. A constant defining a base alignment is still needed for stack
accesses, where the alignment can be adjusted.
The previous patch (D22936) was reverted because tests were failing. This patch also
fixes the causes of those failures:
- The failing x86 tests either did not have the right target or did not have the right alignment.
- The failing NVPTX tests did not have the right alignment.
- The failing AMDGPU test (merge-stores) should allow vectorization with the given alignment,
  but the target info considers <3 x i32> a non-standard type and gives up early. This patch
  removes that condition, only checks for a maximum allowed size, and relies on the next
  condition checking for %4 for correctness. This should be revisited to include 3 x i32 as
  an MVT type (on arsenm's non-immediate todo list).
Note that querying sizeInBits on an MVT is undefined for such types (it leads to an assertion
failure), so we need to create an EVT; hence the interface change to allowsMisaligned to
include the Context.
Reviewers: arsenm, jlebar, tstellarAMD
Subscribers: jholewinski, arsenm, mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D23068
llvm-svn: 277735
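A short sketch of the MVT-vs-EVT point above, written against the 2016-era ValueTypes
API as I understand it (header path and surrounding vectorizer plumbing are assumptions):
an EVT built from the IR type stays valid for non-simple types such as <3 x i32>, and
extended EVTs live in the LLVMContext, which is why the Context ends up in the interface.

    #include "llvm/CodeGen/ValueTypes.h" // EVT; path may differ by LLVM version
    #include "llvm/IR/Type.h"

    // Size query that tolerates non-simple types like <3 x i32>; asking the
    // corresponding MVT for its size would instead hit an assertion.
    unsigned sizeInBitsOf(llvm::Type *Ty) {
      llvm::EVT VT = llvm::EVT::getEVT(Ty); // extended EVTs are kept in the LLVMContext
      return VT.getSizeInBits();
    }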
Reviewers: seanbruno, sdardis
Subscribers: tberghammer, danalbert, srhines, dsanders, sdardis, llvm-commits, seanbruno
Differential Revision: https://reviews.llvm.org/D23113
llvm-svn: 277732
constant vectors
llvm-svn: 277731
NFCI.
The new function now returns true if the shuffle should be commuted.
This will allow target shuffle combines to share the code.
llvm-svn: 277728
llvm-svn: 277727
On modern Intel processors, hardware SQRT is in many cases faster than RSQRT
followed by Newton-Raphson refinement. The patch introduces a simple heuristic
to choose between the hardware SQRT instruction and Newton-Raphson software
estimation.
The patch treats scalars and vectors differently. The heuristic is that for
scalars the compiler should optimize for latency, while for vectors it should
optimize for throughput. It is based on the assumption that throughput-bound
code is likely to be vectorized.
Basically, the patch disables scalar NR for big cores and disables NR completely
for Skylake. Firstly, scalar SQRT has shorter latency than the NR code on big
cores. Secondly, vector SQRT has been greatly improved in Skylake and has better
throughput compared to NR.
Differential Revision: https://reviews.llvm.org/D21379
llvm-svn: 277725
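A small numeric illustration of the trade-off being tuned: one Newton-Raphson step
refining a reciprocal-square-root estimate versus calling the square root directly.
The initial estimate below is an arbitrary stand-in for a hardware RSQRT approximation.

    #include <cmath>
    #include <cstdio>

    // One Newton-Raphson iteration for 1/sqrt(a): x1 = x0 * (1.5 - 0.5 * a * x0 * x0).
    float rsqrtRefined(float a, float x0) {
      return x0 * (1.5f - 0.5f * a * x0 * x0);
    }

    int main() {
      float a = 2.0f;
      float est = 0.7f;                       // stand-in for a hardware RSQRT estimate
      float viaNR = a * rsqrtRefined(a, est); // sqrt(a) ~= a * (1/sqrt(a))
      std::printf("NR: %f  direct sqrt: %f\n", viaNR, std::sqrt(a));
      return 0;
    }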
Differential Revision: https://reviews.llvm.org/D22347
llvm-svn: 277719
llvm-svn: 277716
lowering - this is what cost models are for
Improved CTTZ/CTLZ costings will be added shortly
llvm-svn: 277713
Enable tail calls by default for (micro)MIPS(64).
microMIPS is slightly trickier than MIPS(R6) or microMIPSR6: it has two
instruction encodings, 16-bit and 32-bit, along with some restrictions on the
size of the instruction that can fill the delay slot.
For safe tail calls on microMIPS, the delay slot filler attempts to find a
correctly sized instruction for the delay slot of TAILCALL pseudos.
Reviewers: dsanders, vkalintris
Subscribers: jfb, dsanders, sdardis, llvm-commits
Differential Revision: https://reviews.llvm.org/D21138
llvm-svn: 277708
llvm-svn: 277704
This should ensure that we can atomically write two bytes (on top of the
retq and the one past it) and have those two bytes not straddle cache
lines.
We also move the label past the alignment instruction so that we can refer
to the actual first instruction, as opposed to potential padding before the
aligned instruction.
Update the tests to reflect the new order of assembly.
Reviewers: rSerge, echristo, majnemer
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D23101
llvm-svn: 277701
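The alignment requirement boils down to simple address arithmetic; a self-contained
check follows (the 64-byte line size is an assumption chosen for illustration).

    #include <cassert>
    #include <cstdint>

    // A 2-byte patch starting at Addr stays within one cache line iff the two
    // bytes do not cross a LineSize boundary.
    bool straddlesCacheLine(uint64_t Addr, uint64_t LineSize = 64) {
      return (Addr % LineSize) + 2 > LineSize;
    }

    int main() {
      assert(!straddlesCacheLine(0));
      assert(!straddlesCacheLine(62));
      assert(straddlesCacheLine(63)); // bytes 63 and 64 span two 64-byte lines
      return 0;
    }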
Summary: As per title.
Reviewers: majnemer, spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23139
llvm-svn: 277694
llvm-svn: 277693
This reinstates r277611 + r277614 and reverts r277642.  A cast_or_null
should have been a dyn_cast_or_null.
llvm-svn: 277691
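For readers outside LLVM, the distinction in plain C++ terms (an analogy using
dynamic_cast, not llvm/Support/Casting.h itself): cast_or_null tolerates null but
asserts the type, while dyn_cast_or_null also tolerates a mismatched type.

    #include <cassert>
    #include <cstdio>

    struct Base { virtual ~Base() = default; };
    struct Derived : Base { int Payload = 42; };
    struct Other : Base {};

    // Analogue of cast_or_null<Derived>: allows null, but assumes the type.
    Derived *castOrNull(Base *B) {
      if (!B) return nullptr;
      Derived *D = dynamic_cast<Derived *>(B);
      assert(D && "cast_or_null used on a value of the wrong type");
      return D;
    }

    // Analogue of dyn_cast_or_null<Derived>: allows null and a wrong type.
    Derived *dynCastOrNull(Base *B) {
      return B ? dynamic_cast<Derived *>(B) : nullptr;
    }

    int main() {
      Other O;
      std::printf("%p\n", (void *)dynCastOrNull(&O)); // null, no assertion
      // castOrNull(&O) would assert here -- the kind of failure the fix avoids.
      return 0;
    }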
This reverts commits r277685 & r277688. r277685 broke compiler-rt
compilation http://lab.llvm.org:8080/green/job/clang-stage1-configure-RA_build/23335
and r277688 is a follow-up to it.
llvm-svn: 277690
overloaded (and simpler).
Sean rightly pointed out in code review that we've started using
"wrapper pass" as a specific part of the old pass manager, and in fact
it is more applicable there. Here, we really have a pass *template* to
build a repeated pass, so call it that.
llvm-svn: 277689
Now that we have addressed all the compilation-time problems with GVN-hoist
(https://llvm.org/bugs/show_bug.cgi?id=28670), this patch turns GVN-hoist back
on by default.
Differential Revision: https://reviews.llvm.org/D23136
llvm-svn: 277685
pdbdump calls DbiStreamBuilder::commit through PDBFileBuilder::commit
without calling DbiStreamBuilder::finalize. Because `finalize` is what initializes
the `Header` member, `Header` remained nullptr, which caused a crash.
Differential Revision: https://reviews.llvm.org/D23143
llvm-svn: 277681
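A self-contained sketch of the ordering bug; only the method names finalize/commit
come from the message, and the class body below is invented to show the shape of the
failure.

    #include <cassert>
    #include <memory>

    struct HeaderInfo { int Version = 0; };

    struct StreamBuilderSketch {
      std::unique_ptr<HeaderInfo> Header;
      void finalize() { Header = std::make_unique<HeaderInfo>(); }
      int commit() const {
        // Without the preceding finalize() this would dereference a null Header,
        // which is the crash described above.
        assert(Header && "finalize() must run before commit()");
        return Header->Version;
      }
    };

    int main() {
      StreamBuilderSketch B;
      B.finalize(); // the step pdbdump's path was skipping
      return B.commit();
    }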
rewriteOperands() always performed liveness queries at the base index
rather than the RegSlot/Base as appropriate for the machine operand. This
could lead to illegal rewriting in some cases.
llvm-svn: 277661
constant vectors
llvm-svn: 277659
ErrorOr<>
changing them to Expected<> to allow them to pass through llvm Errors.
No functional change.
This commit by itself will break the next lld builds.  I’ll be committing the
matching change for lld immediately next.
llvm-svn: 277656
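A small sketch of the Expected<> idiom the change moves to, written against
llvm/Support/Error.h; the parsing example itself is invented and is not the ObjectFile
code this commit touches.

    #include "llvm/ADT/StringRef.h"
    #include "llvm/Support/Error.h"
    #include "llvm/Support/raw_ostream.h"

    // Producer: return either a value or a rich llvm::Error, not an error_code.
    llvm::Expected<int> parseValue(llvm::StringRef S) {
      int V = 0;
      if (S.getAsInteger(10, V)) // returns true on failure
        return llvm::make_error<llvm::StringError>("not an integer",
                                                   llvm::inconvertibleErrorCode());
      return V;
    }

    // Consumer: the error object passes through intact instead of collapsing to a
    // code, which is the point of the ErrorOr<> -> Expected<> move.
    llvm::Error printValue(llvm::StringRef S) {
      llvm::Expected<int> V = parseValue(S);
      if (!V)
        return V.takeError();
      llvm::outs() << *V << "\n";
      return llvm::Error::success();
    }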
This patch fixes pr25548.
The current implementation of PPCBoolRetToInt doesn't handle CallInst correctly, so it failed to perform the intended optimization when a CallInst with parameters is present. This patch fixes that.
llvm-svn: 277655
into mov.w"
This reverts commit r277610 / d619aa8878c3dafcc0d29a46517f63ff3209fdd4.
This makes subtarget-no-movt.ll fail in
http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_check/26892.
llvm-svn: 277654
Not a correctness issue, but it would be nice if we didn't have to
recompute our block numbering (worst-case) every time we move MSSA.
llvm-svn: 277652
Limit the number of times the while(1) loop is executed. With this restriction
the number of hoisted instructions does not change significantly on the
test-suite.
Differential Revision: https://reviews.llvm.org/D23028
llvm-svn: 277651
With this patch we compute the DFS numbers of instructions only once and update
them during code generation when an instruction gets hoisted.
Differential Revision: https://reviews.llvm.org/D23021
llvm-svn: 277650
With this patch we compute MemorySSA once and update it in the code generator.
Differential Revision: https://reviews.llvm.org/D22966
llvm-svn: 277649
This reverts commit r277611 and the followup r277614.
Bootstrap builds and chromium builds are crashing during inlining after
this change.
llvm-svn: 277642
Didn't want to fold this in with r277640, since it touches bits that
aren't entirely related to r277640.
llvm-svn: 277641
This is a follow-up to r277637. It teaches MemorySSA that invariant
loads (and loads of provably constant memory) are always liveOnEntry.
llvm-svn: 277640
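A paraphrased sketch of the special case, using metadata APIs on LoadInst; this is not
MemorySSA's actual code path, just the shape of the check.

    #include "llvm/IR/Instructions.h"
    #include "llvm/IR/LLVMContext.h"

    // A load tagged !invariant.load reads provably constant memory, so no store
    // in the function can clobber it and it can be treated as liveOnEntry.
    bool alwaysLiveOnEntry(const llvm::LoadInst &LI) {
      return LI.getMetadata(llvm::LLVMContext::MD_invariant_load) != nullptr;
    }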
constant vectors
llvm-svn: 277638
This patch makes MemorySSA recognize atomic/volatile loads and treat them
specially. This allows us to be a bit more
aggressive in some cases.
Administrative note: Revision was LGTM'ed by reames in person.
Additionally, this doesn't include the `invariant.load` recognition in
the differential revision, because I feel it's better to commit that
separately. Will commit soon.
Differential Revision: https://reviews.llvm.org/D16875
llvm-svn: 277637
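A paraphrased one-liner for the distinction involved (not MSSA's actual logic):
LoadInst::isUnordered() is true only for non-volatile loads that are non-atomic or
atomic-unordered, and anything stronger has to be handled conservatively.

    #include "llvm/IR/Instructions.h"

    // Loads an MSSA-style analysis may treat aggressively; volatile or
    // ordered-atomic loads fall through to the conservative path.
    bool mayTreatAggressively(const llvm::LoadInst &LI) {
      return LI.isUnordered();
    }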
aggressive cast-folding
Summary:
InstCombine unfolds expressions of the form `zext(or(icmp, icmp))` to `or(zext(icmp), zext(icmp))`
so that a later iteration of InstCombine can optimize the exposed `zext(icmp)` instructions.
We now perform this unfolding and the subsequent `zext(icmp)` optimization together. Since the
unfolding no longer happens separately, we also re-enable the folding of
`logic(cast(icmp), cast(icmp))` expressions to `cast(logic(icmp, icmp))`, which had been disabled
because it interfered with the unfolding transformation.
Tested via `make check` and `lnt`.
Background
==========
For a better understanding of how it came to this change, we briefly summarize its history.
In commit r275989 we already tried to enable the folding of `logic(cast(icmp), cast(icmp))` to
`cast(logic(icmp, icmp))`, which had to be reverted in r276106 because it could lead to an
endless loop in InstCombine (also see
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160718/374347.html). The root of the
problem is that `visitZExt()` in InstCombineCasts.cpp also contains the reverse of the above
folding transformation, which unfolds `zext(or(icmp, icmp))` to `or(zext(icmp), zext(icmp))`
in order to expose `zext(icmp)` operations that could then be eliminated by subsequent
iterations of InstCombine. However, before those `zext(icmp)` instructions were eliminated,
the folding from r275989 could kick in and cause InstCombine to switch endlessly back and forth
between the folding and the unfolding transformation. This is why we now combine the `zext`
unfolding and the elimination of the exposed `zext(icmp)` into one step: it lets us keep the
cast-folding of `logic(cast(icmp), cast(icmp))` without entering an endless loop again.
Details on the submitted changes
================================
- In `visitZExt()` we combine the unfolding and optimization of `zext` instructions.
- In `transformZExtICmp()` we have to use `Builder->CreateIntCast()` instead of
  `CastInst::CreateIntegerCast()` to make sure that the new `CastInst` is inserted into a
  `BasicBlock`. The new calls to `transformZExtICmp()` that we introduce in `visitZExt()` would
  otherwise trigger the corresponding assertions (in our case this happened, for example, with
  lnt for the MultiSource/Applications/sqlite3 and SingleSource/Regression/C++/EH/recursive-throw
  tests). The subsequent use of `replaceInstUsesWith()` is necessary to ensure that the new
  `CastInst` replaces the `ZExtInst` accordingly.
- In InstCombineAndOrXor.cpp we again allow the folding of casts on `icmp` instructions.
- The instruction order in the optimized IR for the zext-or-icmp.ll test case is different with
  the introduced changes.
- The test cases in zext.ll have been adopted from the reverted commits r275989 and r276105.
Reviewers: grosser, majnemer, spatel
Subscribers: eli.friedman, majnemer, llvm-commits
Differential Revision: https://reviews.llvm.org/D22864
Contributed-by: Matthias Reisinger <d412vv1n@gmail.com>
llvm-svn: 277635
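The two expression shapes at the heart of this, written as ordinary C++ for illustration
(InstCombine of course works on the IR forms); the loop just demonstrates that both
shapes compute the same value.

    #include <cassert>

    // zext(or(icmp, icmp)): the comparison results are OR'ed as i1, then widened.
    unsigned folded(int A, int B, int C, int D) {
      return static_cast<unsigned>((A == B) | (C == D));
    }

    // or(zext(icmp), zext(icmp)): each comparison is widened first, then OR'ed.
    unsigned unfolded(int A, int B, int C, int D) {
      return static_cast<unsigned>(A == B) | static_cast<unsigned>(C == D);
    }

    int main() {
      for (int A = 0; A < 2; ++A)
        for (int C = 0; C < 2; ++C)
          assert(folded(A, 1, C, 1) == unfolded(A, 1, C, 1));
      return 0;
    }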
Patch by Aditya Kumar.
Differential Revision: https://reviews.llvm.org/D22967
llvm-svn: 277634
We currently only support combining target shuffles that consist of a single source input
(plus elements known to be undef/zero).
This patch generalizes the recursive combining of a target shuffle to collect all of its
inputs, merging any duplicates along the way, into a full set of source ops and a shuffle
mask over them.
This uncovers a number of cases where we previously failed to combine a unary shuffle
because the input had been duplicated and split apart during lowering.
This will allow us to combine to 2-input shuffles in a future patch.
Differential Revision: https://reviews.llvm.org/D22859
llvm-svn: 277631
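A generic, self-contained sketch of the bookkeeping described above (it is not the x86
lowering code): collect the shuffle's inputs, merge duplicates, and remap mask indices
onto the deduplicated operand list. The 4-lanes-per-operand layout is an arbitrary
choice for the example.

    #include <cassert>
    #include <cstdio>
    #include <vector>

    using Value = const char; // stand-in for a shuffle operand

    // Merge duplicate operands and rewrite Mask so it indexes the merged list.
    void dedupeInputs(std::vector<Value *> &Ops, std::vector<int> &Mask) {
      std::vector<Value *> Unique;
      for (int &M : Mask) {
        if (M < 0)
          continue;               // undef/zero lanes stay as-is
        Value *Op = Ops[M / 4];   // 4 lanes per operand in this sketch
        int Lane = M % 4;
        int Idx = -1;
        for (size_t I = 0; I < Unique.size(); ++I)
          if (Unique[I] == Op) { Idx = static_cast<int>(I); break; }
        if (Idx < 0) {
          Idx = static_cast<int>(Unique.size());
          Unique.push_back(Op);
        }
        M = Idx * 4 + Lane;       // remap into the merged operand list
      }
      Ops = Unique;
    }

    int main() {
      Value *A = "A", *B = "B";
      std::vector<Value *> Ops = {A, B, A}; // A got duplicated during lowering
      std::vector<int> Mask = {0, 5, 9, 2}; // index 9 really reads from A again
      dedupeInputs(Ops, Mask);
      assert(Ops.size() == 2 && Mask[2] == 1);
      std::printf("ops=%zu mask[2]=%d\n", Ops.size(), Mask[2]);
      return 0;
    }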
This reverts the revert commit r277627. The build errors
mentioned in r277627 were likely caused by an unclean build directory.
Sorry for the noise.
llvm-svn: 277630
splat vectors
This removes the restriction for the icmp constant, but as noted by the FIXME comments, 
we still need to change individual checks for binop operand constants.
llvm-svn: 277629