| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
| |
operand.
This matches the Intel and AMD documentation and is consistent with the LAR instruction.
llvm-svn: 325197
|
| |
|
|
|
|
| |
The match would be ambiguous, but at&t asm parsing doesn't support ambiguous matches and will just return the first.
llvm-svn: 325192
|
| |
|
|
| |
llvm-svn: 325190
|
| |
|
|
|
|
| |
This should work with vector constants too, but it's currently limited to scalar.
llvm-svn: 325187
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
TypeID summaries are used by CFI and need to be serialized by ThinLTO
indexing for later use by LTO Backend.
Reviewers: tejohnson, pcc
Subscribers: mehdi_amini, inglorion, eraman, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D42611
llvm-svn: 325182
|
| |
|
|
|
|
|
|
|
| |
unified GetResources callback.
Having a single 'GetResources' callback will simplify adding new resources in
the future.
llvm-svn: 325180
|
| |
|
|
|
|
|
| |
Queries need to stay alive until each owner has set the values they are
responsible for.
llvm-svn: 325179
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The bound instruction does not have reversed operands in gas.
Fixes PR27653.
Patch by Maya Madhavan.
Differential Revision: https://reviews.llvm.org/D43243
llvm-svn: 325178
|
| |
|
|
| |
llvm-svn: 325169
|
| |
|
|
|
|
|
|
| |
Reorder them to match MIR.
Predecessors are only comments, and they're not usually printed in MIR.
llvm-svn: 325166
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
more opcodes
Summary:
This patch adds templated functions to MachineIRBuilder for some opcodes
and adds pattern matcher support for G_AND and G_OR.
Reviewers: aditya_nandakumar
Reviewed By: aditya_nandakumar
Subscribers: rovka, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D43309
llvm-svn: 325162
|
| |
|
|
|
|
|
| |
It can never be null and most callers were already using references or
std::unique_ptr.
llvm-svn: 325160
|
| |
|
|
|
|
|
| |
This simplifies most callers as they are already using references or
std::unique_ptr.
llvm-svn: 325155
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
reserved regs or callee saved regs change
Previously we only invalidated the pressure set limit cached when the TargetRegisterInfo pointer changes. But as reserved regs and callee saved regs are used as part of calculating the limits we should invalidate when those change too.
I encountered this when reverting a patch from the 6.0 branch. One of the x86 test files had a function that used rbp as a frame pointer, making it reserved. It was followed by another function which didn't use rbp but had the same TRI so the pressure set limit cache was not invalidated. If i removed the function that used rbp as a frame pointer from the file, the remaining function then got a different register pressure limit for the GR16 pressure set. This caused the machine scheduler to change the scheduling for the function. This was an unexpected change from just deleting a function.
I don't have a test case for trunk because the particular x86 test case is different enough from the 6.0 branch to not be affected now.
Differential Revision: https://reviews.llvm.org/D43274
llvm-svn: 325153
|
| |
|
|
|
|
|
|
|
| |
Move computeLoopSafetyInfo, defined in Transforms/Utils/LoopUtils.h,
into the corresponding LoopUtils.cpp, as opposed to LICM where it resides
at the moment. This will allow other functions from Transforms/Utils
to reference it.
llvm-svn: 325151
|
| |
|
|
|
|
|
|
|
|
| |
between PACK*SDW/PACK*SWB
Try to keep PACK*SDW/PACK*SWB as wide as possible, this helps ComputeNumSignBits as it can only peek through bitcasts to wider types, pre-AVX2 codegen was already doing this as it could peek through bitcasts/subvectors more easily than AVX2 could through shuffles.
This shouldn't affect existing results as calls to truncateVectorWithPACK ensure we have enough sign bits to pack to the same value, but it should make it possible to use truncateVectorWithPACK chains to perform saturation in combineTruncateWithSat with a future patch.
llvm-svn: 325149
|
| |
|
|
|
|
|
|
|
|
|
|
| |
select(C, Z, binop(Y, W)) if the binop is rem or div.
The select may have been preventing a division by zero or INT_MIN/-1 so removing it might not be safe.
Fixes PR36362.
Differential Revision: https://reviews.llvm.org/D43276
llvm-svn: 325148
|
| |
|
|
|
|
|
|
|
| |
Kernel arguments likely read by all workitems and should not bypass
cache. Fixes performance hit in sub-dword argument loads.
Differential Revision: https://reviews.llvm.org/D43249
llvm-svn: 325146
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The prologue-end line record must be emitted after the last
instruction that is part of the function frame setup code and before
the instruction that marks the beginning of the function body.
Patch by Carlos Alberto Enciso!
Differential Revision: https://reviews.llvm.org/D41762
llvm-svn: 325143
|
| |
|
|
| |
llvm-svn: 325142
|
| |
|
|
| |
llvm-svn: 325141
|
| |
|
|
|
|
|
| |
This keeps with our current usage of 'match' and is easier to see that
the optional NSW only applies in the non-constant operand case.
llvm-svn: 325140
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
So that macros defined in inline assembly blocks are available to the
whole file.
This provides a consistent behavior with other assembly directives,
since equations for example are already preserved between inline
assembly blocks.
PR: 36110
Patch by Roger!
llvm-svn: 325139
|
| |
|
|
|
|
|
|
| |
When creating high MachineMemOperand for MSTORE/MLOAD we supply
it with the original PointerInfo, while the pointer itself had been incremented.
The patch adds the proper offset to the PointerInfo.
llvm-svn: 325135
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Reversed loads are handled as gathering. But we can just reshuffle
these values. Patch adds support for vectorization of reversed loads.
Reviewers: RKSimon, spatel, mkuper, hfinkel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43022
llvm-svn: 325134
|
| |
|
|
|
|
|
|
| |
This adds support for handling f16 stack spills/reloads.
Differential Revision: https://reviews.llvm.org/D43280
llvm-svn: 325130
|
| |
|
|
| |
llvm-svn: 325129
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a load follows a store and reloads data that the store has written to memory, Intel microarchitectures can in many cases forward the data directly from the store to the load, This "store forwarding" saves cycles by enabling the load to directly obtain the data instead of accessing the data from cache or memory.
A "store forward block" occurs in cases that a store cannot be forwarded to the load. The most typical case of store forward block on Intel Core microarchiticutre that a small store cannot be forwarded to a large load.
The estimated penalty for a store forward block is ~13 cycles.
This pass tries to recognize and handle cases where "store forward block" is created by the compiler when lowering memcpy calls to a sequence
of a load and a store.
The pass currently only handles cases where memcpy is lowered to XMM/YMM registers, it tries to break the memcpy into smaller copies.
breaking the memcpy should be possible since there is no atomicity guarantee for loads and stores to XMM/YMM.
Change-Id: Ic41aa9ade6512e0478db66e07e2fde41b4fb35f9
llvm-svn: 325128
|
| |
|
|
|
|
|
|
| |
truncation
While the AVX512 VTRUNCS/VTRUNCUS instructions require legal types, truncateVectorWithPACK handles cases with multiples of legal types through splitting/concatenation. So we just need to ensure that the src/dst scalar types are correct and leave truncateVectorWithPACK to handle the rest of it.
llvm-svn: 325127
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
instrs before call.
For basic blocks with instructions between the beginning of the block
and a call we have to duplicate the instructions before the call in all
split blocks and add PHI nodes for uses of the duplicated instructions
after the call.
Currently, the threshold for the number of instructions before a call
is quite low, to keep the impact on binary size low.
Reviewers: junbuml, mcrosier, davidxl, davide
Reviewed By: junbuml
Differential Revision: https://reviews.llvm.org/D41860
llvm-svn: 325126
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
We can use incremental dominator tree updates to avoid re-calculating
the dominator tree after interchanging 2 loops.
Reviewers: dmgreen, kuhar
Reviewed By: kuhar
Differential Revision: https://reviews.llvm.org/D43176
llvm-svn: 325122
|
| |
|
|
|
|
|
|
|
|
| |
Preserve debug info from a dead 'and' instruction with a constant.
Patch by Djordje Todorovic.
Differential Revision: https://reviews.llvm.org/D43163
llvm-svn: 325119
|
| |
|
|
|
|
|
| |
The "knownValuesUnicode" test in the patch fails on ppc64 and arm64
bots. Reverting while I investigate.
llvm-svn: 325115
|
| |
|
|
| |
llvm-svn: 325110
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch implements a variant of the DJB hash function which folds the
input according to the algorithm in the Dwarf 5 specification (Section
6.1.1.4.5), which in turn references the Unicode Standard (Section 5.18,
"Case Mappings").
To achieve this, I have added a llvm::sys::unicode::foldCharSimple
function, which performs this mapping. The implementation of this
function was generated from the CaseMatching.txt file from the Unicode
spec using a python script (which is also included in this patch). The
script tries to optimize the function by coalescing adjecant mappings
with the same shift and stride (terms I made up). Theoretically, it
could be made a bit smarter and merge adjecant blocks that were
interrupted by only one or two characters with exceptional mapping, but
this would save only a couple of branches, while it would greatly
complicate the implementation, so I deemed it was not worth it.
Since we assume that the vast majority of the input characters will be
US-ASCII, the folding hash function has a fast-path for handling these,
and only whips out the full decode+fold+encode logic if we encounter a
character outside of this range. It might be possible to implement the
folding directly on utf8 sequences, but this would also bring a lot of
complexity for the few cases where we will actually need to process
non-ascii characters.
Reviewers: JDevlieghere, aprantl, probinson, dblaikie
Subscribers: mgorny, hintonda, echristo, clayborg, vleschuk, llvm-commits
Differential Revision: https://reviews.llvm.org/D42740
llvm-svn: 325107
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Making a width of GEP Index, which is used for address calculation, to be one of the pointer properties in the Data Layout.
p[address space]:size:memory_size:alignment:pref_alignment:index_size_in_bits.
The index size parameter is optional, if not specified, it is equal to the pointer size.
Till now, the InstCombiner normalized GEPs and extended the Index operand to the pointer width.
It works fine if you can convert pointer to integer for address calculation and all registered targets do this.
But some ISAs have very restricted instruction set for the pointer calculation. During discussions were desided to retrieve information for GEP index from the Data Layout.
http://lists.llvm.org/pipermail/llvm-dev/2018-January/120416.html
I added an interface to the Data Layout and I changed the InstCombiner and some other passes to take the Index width into account.
This change does not affect any in-tree target. I added tests to cover data layouts with explicitly specified index size.
Differential Revision: https://reviews.llvm.org/D42123
llvm-svn: 325102
|
| |
|
|
|
|
| |
This exact code already exists a little further up.
llvm-svn: 325101
|
| |
|
|
|
|
| |
Follow-up to r325049
llvm-svn: 325085
|
| |
|
|
| |
llvm-svn: 325069
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
r323681
* Document most API's
* Delete a useless function call
* Fix a discrepancy between the single and multi-opcode variants of
getActionDefinitions().
The multi-opcode variant now requires that more than one opcode is requested.
Previously it acted much like the single-opcode form but unnecessarily
enforced the requirements of the multi-opcode form.
llvm-svn: 325067
|
| |
|
|
|
|
|
|
|
|
| |
This preserves an additional 581 unique source variables in a stage2
build of clang (according to `llvm-dwarfdump --statistics`). It
increases the size of the .debug_loc section by 0.1% (or 87139 bytes).
Differential Revision: https://reviews.llvm.org/D43255
llvm-svn: 325063
|
| |
|
|
|
|
|
|
|
|
|
| |
This replaces the bit-tracking based fold that did the same thing,
but it only worked for scalars and not directly.
There is no evidence in existing regression tests that the greater
power of bit-tracking was needed here, but we should be aware of
this potential loss of optimization.
llvm-svn: 325062
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Instead of solving the hard problem of how to pass the callee to the indirect
jump thunk without a register, just use a CSR. At a call boundary, there's
nothing stopping us from using a CSR to hold the callee as long as we save and
restore it in the prologue.
Also, add tests for this mregparm=3 case. I wrote execution tests for
__llvm_retpoline_push, but they never got committed as lit tests, either
because I never rewrote them or because they got lost in merge conflicts.
Reviewers: chandlerc, dwmw2
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D43214
llvm-svn: 325049
|
| |
|
|
|
|
|
|
|
| |
This is both a functional improvement for vectors and an
efficiency improvement for scalars. The existing code below
the new folds does the same thing for scalars, but in an
indirect and expensive way.
llvm-svn: 325048
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Also make a drive-by-fix of a bug in the subregister scan code that
only triggers with an incomplete or otherwise very irregular machine
description.
rdar://problem/37404493
This re-applies r324972 with an early exit in the case of a complete
failure to make this commit NFC again as intended.
llvm-svn: 325041
|
| |
|
|
|
|
|
|
|
|
| |
According to `llvm-dwarfdump --statistics` this salvages 43 additional
unique source variables in a stage2 build of clang. It increases the
size of the .debug_loc section by 0.002% (or 2864 bytes).
Differential Revision: https://reviews.llvm.org/D43220
llvm-svn: 325035
|
| |
|
|
|
|
|
|
|
|
|
| |
It caused "Cannot select: t33: f64 = AArch64ISD::FMOV Constant:i32<0>"
in Chromium builds. See PR36369.
> Get rid of icky goto loops and make the code easier to maintain (NFC).
>
> Differential revision: https://reviews.llvm.org/D42723
llvm-svn: 325034
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Old syntax:
BUNDLE implicit-def %r0, implicit-def %r1, implicit %r2
* %r0 = SOME_OP %r2
* %r1 = ANOTHER_OP internal %r0
New syntax:
BUNDLE implicit-def %r0, implicit-def %r1, implicit %r2 {
%r0 = SOME_OP %r2
%r1 = ANOTHER_OP internal %r0
}
llvm-svn: 325032
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D43170
llvm-svn: 325030
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(not y))
Summary:
If the and has an additional use we shouldn't invert it. That creates an additional instruction.
While there add a one use check to the transform above that looked similar.
Reviewers: spatel, RKSimon
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D43225
llvm-svn: 325019
|