| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Changes to the original scalar loop during LV code gen cause the return value
of Legal->isConsecutivePtr() to be inconsistent with the return value during
legal/cost phases (further analysis and information of the bug is in D39346).
This patch is an alternative fix to PR34965 following the CM_Widen approach
proposed by Ayal and Gil in D39346. It extends InstWidening enum with
CM_Widen_Reverse to properly record the widening decision for consecutive
reverse memory accesses and, consequently, get rid of the
Legal->isConsetuviePtr() call in LV code gen. I think this is a simpler/cleaner
solution to PR34965 than the one in D39346.
Fixes PR34965.
Patch by Diego Caballero, thanks!
Differential Revision: https://reviews.llvm.org/D40742
llvm-svn: 320913
|
| |
|
|
|
|
|
|
|
|
|
| |
r307148 added an assembly mnemonic spelling correction support and enabled it
on ARM. This enables that support on PowerPC as well.
Patch by Dmitry Venikov, thanks!
Differential Revision: https://reviews.llvm.org/D40552
llvm-svn: 320911
|
| |
|
|
|
|
|
|
| |
classes LZCNT/POPCNT.
I think when this instruction was first published it was only for a Knights CPU and thus VLX version was missing.
llvm-svn: 320910
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
1. Use stream 0 only for combined module. Previously if combined module was not
processes ThinLTO used the stream for own output. However small changes in input,
could trigger combined module and shuffle outputs making life of llvm::LTO harder.
2. Always process combined module and write output to stream 0. Processing empty
combined module is cheap and allows llvm::LTO users to avoid implementing processing
which is already done in llvm::LTO.
Subscribers: mehdi_amini, inglorion, eraman, hiraditya
Differential Revision: https://reviews.llvm.org/D41267
llvm-svn: 320905
|
| |
|
|
|
|
|
|
|
|
|
|
| |
When unsafe algerbra is allowed calls to cabs(r) can be replaced by:
sqrt(creal(r)*creal(r) + cimag(r)*cimag(r))
Patch by Paul Walker, thanks!
Differential Revision: https://reviews.llvm.org/D40069
llvm-svn: 320901
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
to VPlan.h
This is a small step forward to move VPlan stuff to where it should belong (i.e., VPlan.*):
1. VP*Recipe classes in LoopVectorize.cpp are moved to VPlan.h.
2. Many of VP*Recipe::print() and execute() definitions are still left in
LoopVectorize.cpp since they refer to things declared in LoopVectorize.cpp. To
be moved to VPlan.cpp at a later time.
3. InterleaveGroup class is moved from anonymous namespace to llvm namespace.
Referencing it in anonymous namespace from VPlan.h ended up in warning.
Patch by Hideki Saito, thanks!
Differential Revision: https://reviews.llvm.org/D41045
llvm-svn: 320900
|
| |
|
|
|
|
| |
Hopefully r320864 has fixed the offending case that failed the assert.
llvm-svn: 320898
|
| |
|
|
|
|
| |
Fix incorrect placement of #endif causing NDEBUG build failures.
llvm-svn: 320897
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This implements a missing feature to allow importing of aliases, which
was previously disabled because alias cannot be available_externally.
We instead import an alias as a copy of its aliasee.
Some additional work was required in the IndexBitcodeWriter for the
distributed build case, to ensure that the aliasee has a value id
in the distributed index file (i.e. even when it is not being
imported directly).
This is a performance win in codes that have many aliases, e.g. C++
applications that have many constructor and destructor aliases.
Reviewers: pcc
Subscribers: mehdi_amini, inglorion, eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D40747
llvm-svn: 320895
|
| |
|
|
| |
llvm-svn: 320893
|
| |
|
|
|
|
|
| |
This reverts commit 0afef672f63f0e4e91938656bc73424a8c058bfc.
Still failing at runtime on bots.
llvm-svn: 320888
|
| |
|
|
|
|
|
|
|
|
|
| |
Adds missing support for DW_FORM_data16.
Update of r320852, fixing the unittest to use a hand-coded struct
instead of std::array to guarantee data layout.
Differential Revision: https://reviews.llvm.org/D41090
llvm-svn: 320886
|
| |
|
|
| |
llvm-svn: 320885
|
| |
|
|
|
|
| |
The Function can never be nullptr so we can return a reference.
llvm-svn: 320884
|
| |
|
|
|
|
| |
Slight cleanup/refactor in preparation for upcoming commit.
llvm-svn: 320882
|
| |
|
|
|
|
| |
[-Werror=extra]' warning introduced by r320750
llvm-svn: 320868
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is primarily to reduce stack usage, but ordering the use queue
according to the position in the code (earlier instructions visited
before later ones) reduces the number of unnecessary bottoms due to
visiting instructions out of order, e.g.
%reg1 = copy %reg0
%reg2 = copy %reg0
%reg3 = and %reg1, %reg2
Here, reg3 should be known to be same as reg0-2, but if reg3 is
evaluated after reg1 is updated, but before reg2 is updated, the two
inputs to the and will appear different, causing reg3 to become
bottom.
llvm-svn: 320866
|
| |
|
|
| |
llvm-svn: 320865
|
| |
|
|
|
|
|
|
| |
for 32-bit mode.
This seemed to work due to a quirk in the X86 MC encoder that didn't emit a REX byte that the AND64ri8 implies when in 32-bit mode. This made the encoding the same as AND32ri8. I tried to add an assert to catch the dropped REX prefix that caught this.
llvm-svn: 320864
|
| |
|
|
|
|
|
|
| |
target specific nodes.
The target independent nodes will get legalized to the target specific nodes by their own legalization process. Someday I'd like to stop using a target specific for zero extends and truncates of legal types so the less places we reference the target specific opcode the better.
llvm-svn: 320863
|
| |
|
|
|
|
| |
When I wrote it I thought we were missing a potential optimization for KNL. But investigating further shows that for KNL we still do the optimal thing by widening to v4f32 and then using special isel patterns to widen again to zmm a register.
llvm-svn: 320862
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This recommits r320823 reverted due to the test failure in sink-foldable.ll and
an unused variable. Added "REQUIRES: aarch64-registered-target" in the test
and removed unused variable.
Original commit message:
Continue trying to sink an instruction if its users in the loop is foldable.
This will allow the instruction to be folded in the loop by decoupling it from
the user outside of the loop.
Reviewers: hfinkel, majnemer, davidxl, efriedma, danielcdh, bmakam, mcrosier
Reviewed By: hfinkel
Subscribers: javed.absar, bmakam, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D37076
llvm-svn: 320858
|
| |
|
|
|
|
| |
Unit test fails on some bots.
llvm-svn: 320857
|
| |
|
|
|
|
|
| |
PatFrag now has the atomicity information stored as bit fields. They
need to be copied to the new PatFrag.
llvm-svn: 320855
|
| |
|
|
|
|
|
|
| |
Adds missing support for DW_FORM_data16.
Differential Revision: https://reviews.llvm.org/D41090
llvm-svn: 320852
|
| |
|
|
|
|
| |
It seems to be failing real code which is concerning, but we were silently getting away with it. I'll investigate further.
llvm-svn: 320850
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
non-constant index
Summary:
Currently we don't handle v32i1/v64i1 insert_vector_elt correctly as we fail to look at the number of elements closely and assume it can only be v16i1 or v8i1.
We also can't type legalize v64i1 insert_vector_elt correctly on KNL due to the type not being byte addressable as required by the legalizing through memory accesses path requires.
For the first issue, the patch now tries to pick a 512-bit register with the correct number of elements and promotes to that.
For the second issue, we now extend the vector to a byte addressable type, do the stores to memory, load the two halves, and then truncate the halves back to the original type. Technically since we changed the type, we may not need two loads, but actually checking that is more work and for the v64i1 case we do need them.
Reviewers: RKSimon, delena, spatel, zvi
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D40942
llvm-svn: 320849
|
| |
|
|
|
|
|
|
|
|
|
| |
The original memcpy expansion inserted the loop basic block inbetween
the 2 new basic blocks created by splitting the original block the memcpy
call was in. This commit makes the new memcpy expansion do the same to keep the
layout of the IR matching between the old and new implementations.
Differential Review: https://reviews.llvm.org/D41197
llvm-svn: 320848
|
| |
|
|
|
|
|
|
| |
have memory and immediate operands.
The asm parser wasn't preventing these from being accepted in 32-bit mode. Instructions that use a GR64 register are protected by the parser rejecting the register in 32-bit mode.
llvm-svn: 320846
|
| |
|
|
|
|
| |
This instruction doesn't access memory. It juse use a similar looking memory encoding. Don't require Intel syntax to put "qword ptr" in front of it.
llvm-svn: 320845
|
| |
|
|
|
|
| |
These aren't doing anything due to a top level "let Predicates =". I think the GR32/GR64 register class protects these anyway.
llvm-svn: 320844
|
| |
|
|
|
|
| |
This has no effect due to a top level "let Predicates =" around the instructions. But its also not required because the GR64 usage in the instruction guarantees it can never match.
llvm-svn: 320843
|
| |
|
|
| |
llvm-svn: 320842
|
| |
|
|
| |
llvm-svn: 320840
|
| |
|
|
|
|
|
|
| |
Fix typo in the representative instruction replacement.
Also, fix formatting and reword some comments.
llvm-svn: 320839
|
| |
|
|
| |
llvm-svn: 320838
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D40960
llvm-svn: 320837
|
| |
|
|
|
|
| |
This reverts commit r320833.
llvm-svn: 320836
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This recommit r320823 after fixing a test failure.
Original commit message:
Continue trying to sink an instruction if its users in the loop is foldable.
This will allow the instruction to be folded in the loop by decoupling it from
the user outside of the loop.
Reviewers: hfinkel, majnemer, davidxl, efriedma, danielcdh, bmakam, mcrosier
Reviewed By: hfinkel
Subscribers: javed.absar, bmakam, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D37076
llvm-svn: 320833
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
llvm-objdump's Mach-O parser was updated in r306037 to display external
relocations for MH_KEXT_BUNDLE file types. This change extends the Macho-O
parser to display local relocations for MH_PRELOAD files. When used with
the -macho option relocations will be displayed in a historical format.
All tests are passing for llvm, clang, and lld. llvm-objdump builds without
compiler warnings.
rdar://35778019
Reviewers: enderby
Reviewed By: enderby
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D41199
llvm-svn: 320832
|
| |
|
|
|
|
|
|
|
|
| |
assembler in 32-bit mode.
There was a top level "let Predicates =" in the .td file that was overriding the Requires on each instruction.
I've added an assert to the code emitter to catch more cases like this. I'm sure this isn't the only place where the right predicates aren't being applied. This assert already found that we don't block btq/btsq/btrq in 32-bit mode.
llvm-svn: 320830
|
| |
|
|
|
|
| |
This reverts commit r320823.
llvm-svn: 320828
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
debug output
Work towards the unification of MIR and debug output by printing
`%stack.0` instead of `<fi#0>`, and `%fixed-stack.0` instead of
`<fi#-4>` (supposing there are 4 fixed stack objects).
Only debug syntax is affected.
Differential Revision: https://reviews.llvm.org/D41027
llvm-svn: 320827
|
| |
|
|
|
|
| |
https://reviews.llvm.org/D41291
llvm-svn: 320825
|
| |
|
|
|
|
|
|
| |
legalization if we have AVX512F, but not VLX. NFC
Previously we widened it using isel patterns.
llvm-svn: 320824
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Continue trying to sink an instruction if its users in the loop is foldable.
This will allow the instruction to be folded in the loop by decoupling it from
the user outside of the loop.
Reviewers: hfinkel, majnemer, davidxl, efriedma, danielcdh, bmakam, mcrosier
Reviewed By: hfinkel
Subscribers: javed.absar, bmakam, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D37076
llvm-svn: 320823
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The following CFI directives are suported by MC but not by MIR:
* .cfi_rel_offset
* .cfi_adjust_cfa_offset
* .cfi_escape
* .cfi_remember_state
* .cfi_restore_state
* .cfi_undefined
* .cfi_register
* .cfi_window_save
Add support for printing, parsing and update tests.
Differential Revision: https://reviews.llvm.org/D41230
llvm-svn: 320819
|
| |
|
|
|
|
|
|
|
|
|
| |
SROA analysis of InlineCost can figure out that some stores can be removed
after inlining and then the repeated loads clobbered by these stores are also
free. This patch finds these clobbered loads and adjust the inline cost
accordingly.
Differential Revision: https://reviews.llvm.org/D33946
llvm-svn: 320814
|
| |
|
|
| |
llvm-svn: 320811
|
| |
|
|
| |
llvm-svn: 320806
|