| Commit message (Collapse) | Author | Age | Files | Lines | 
| | 
| 
| 
| 
| 
| 
| 
| 
|  | 
Both of these places reference memset-like loops. Memset is precise.
Trying to keep these patches super small so they're easily post-commit
verifiable, as requested in D44748.
llvm-svn: 350044
 | 
| | 
| 
| 
|  | 
llvm-svn: 350043
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
More migration so we can disable the implicit int -> LocationSize
conversion.
All of these are either scatter/gather'ed vector instructions, or direct
loads. Hence, they're all precise.
Perhaps if we see way more getTypeStoreSize calls, we can make a
getTypeStoreLocationSize (or similar) as a wrapper that applies this
::precise. Doesn't appear that it's a good idea to make getTypeStoreSize
return a LocationSize itself, however.
llvm-svn: 350042
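A minimal sketch of the wrapper idea floated above, using a toy model rather than LLVM's real classes. `ToyLocationSize`, `toyTypeStoreSize`, and `toyTypeStoreLocationSize` are hypothetical names invented for illustration. The only assumption carried over from the message is that a size can be marked precise and that no implicit integer conversion should exist.

```cpp
#include <cstdint>
#include <optional>
#include <type_traits>

// Toy stand-in for a location size: either a precise byte count or unknown.
// This is NOT llvm::LocationSize; it only models the idea for illustration.
class ToyLocationSize {
  std::optional<std::uint64_t> Bytes;
  explicit ToyLocationSize(std::optional<std::uint64_t> B) : Bytes(B) {}

public:
  // Named constructors force callers to state how trustworthy the size is.
  static ToyLocationSize precise(std::uint64_t B) { return ToyLocationSize(B); }
  static ToyLocationSize unknown() { return ToyLocationSize(std::nullopt); }

  bool hasValue() const { return Bytes.has_value(); }
  std::uint64_t getValue() const { return *Bytes; }
};

// No implicit integer -> ToyLocationSize conversion is available.
static_assert(!std::is_convertible_v<std::uint64_t, ToyLocationSize>,
              "sizes must be wrapped explicitly");

// Hypothetical stand-in for a store-size query: bytes needed for Bits bits.
std::uint64_t toyTypeStoreSize(std::uint64_t Bits) { return (Bits + 7) / 8; }

// Hypothetical wrapper: the plain size query stays an integer, and the
// wrapper applies precise() in one place.
ToyLocationSize toyTypeStoreLocationSize(std::uint64_t Bits) {
  return ToyLocationSize::precise(toyTypeStoreSize(Bits));
}
```

In this sketch the integer-returning query is left alone, so callers that compare raw sizes are unaffected, while callers that need a location size go through the wrapper.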
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
SRL/SHL+TEST to X86ISelDAGToDAG.
This cleans more code out of EmitTest.
llvm-svn: 350041
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
Remove the TESTmr isel patterns and add another postprocessing combine for TESTrr+ANDrm->TESTmr. We already have a postprocessing combine for TESTrr+ANDrr->TESTrr. With this we can give ANDN a chance to match first, and clean it up during postprocessing if we end up with just a regular AND.
This is another step towards my plan to gut EmitTest and do more flag handling during isel matching or by using optimizeCompare.
llvm-svn: 350038
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
It's dangerous to knowingly create an illegal vector type
no matter what stage of combining we're in.
This prevents the missed folding/scalarization seen in:
https://bugs.llvm.org/show_bug.cgi?id=40146
llvm-svn: 350034
 | 
| | 
| 
| 
|  | 
llvm-svn: 350032
 | 
| | 
| 
| 
| 
| 
| 
|  | 
The `fp` and `s8` register names are synonyms, but `fp` better reflects
the purpose of the register.
llvm-svn: 350023
 | 
| | 
| 
| 
|  | 
llvm-svn: 350022
 | 
| | 
| 
| 
| 
| 
|  | 
It's redundant to restore the `$a3` register twice.
llvm-svn: 350021
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Trying to keep these patches super small so they're easily post-commit
verifiable, as requested in D44748.
srcSize is derived from the size of an alloca, and we quit out if the
size of that is > the size of the thing we're copying to. Hence, we
should always copy everything over, so these sizes are precise.
Don't make srcSize itself a LocationSize, since optionality isn't
helpful, and we do some comparisons against other sizes elsewhere in
that function.
llvm-svn: 350019
 | 
| | 
| 
| 
| 
| 
|  | 
We won't end up using an ANDN instruction in this case so we should generate the same code we do for pre-BMI targets.
llvm-svn: 350018
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Trying to keep these patches super small so they're easily post-commit
verifiable, as requested in D44748.
This one sadly isn't *super* small, but all of the changes here are
either to:
- libfuncs that are passed a constant size (memcpy, memset, ...)
- instructions that store/load a constant size
So they have to be precise
llvm-svn: 350017
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Keeping these patches super small so they're easily post-commit
verifiable, as requested in D44748.
This tries to find literal loads/stores of the given type, so this has
to be precise.
llvm-svn: 350016
 | 
| | 
| 
| 
| 
| 
| 
|  | 
Keeping these patches super small so they're easily post-commit
verifiable, as requested in D44748.
llvm-svn: 350015
 | 
| | 
| 
| 
| 
| 
| 
|  | 
Keeping these patches super small so they're easily post-commit
verifiable, as requested in D44748.
llvm-svn: 350014
 | 
| | 
| 
| 
| 
| 
|  | 
instruction we use for sequentially consistent fence in 32-bit mode without SSE2.
llvm-svn: 350013
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
accessing ranges defined by low/high_pc
This is difficult/not possible to test in LLVM, but is visible as a
crash in LLD when parsing DWARF to generate gdb-index.
This function is called by llvm-dwarfdump when parsing high_pc for
non-verbose output (to print the actual high_pc rather than the low_pc
relative value), but in that case llvm-dwarfdump doesn't print section
names (if it did, it would hit this problem).
We could add some other features to llvm-dwarfdump to expose this, but
nothing really springs to my mind. I will add a test to lld, though.
llvm-svn: 350010
 | 
| | 
| 
| 
|  | 
llvm-svn: 350009
 | 
| | 
| 
| 
| 
| 
| 
|  | 
Keeping these patches super small so they're easily post-commit
verifiable, as requested in D44748.
llvm-svn: 350008
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
trunc (add X, C ) --> add (trunc X), C'
If we're throwing away the top bits of an 'add' instruction, do it in the narrow destination type.
This makes the truncate-able opcode list identical to the sibling transform done in IR (in instcombine).
This change used to show regressions for x86, but those are gone after D55494. 
This gets us closer to deleting the x86 custom function (combineTruncatedArithmetic) 
that does almost the same thing.
Differential Revision: https://reviews.llvm.org/D55866
llvm-svn: 350006
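The narrowing described above is sound because addition modulo 2^32 followed by truncation to 8 bits gives the same low bits as truncating first and adding in 8 bits. A small sketch with plain fixed-width integers illustrates this. It is not the DAG combiner code, and the function names are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Wide form: add in 32 bits, then truncate the result to 8 bits.
constexpr std::uint8_t truncOfAdd(std::uint32_t X, std::uint32_t C) {
  return static_cast<std::uint8_t>(X + C);
}

// Narrow form: truncate X and the constant first, then add in 8 bits.
constexpr std::uint8_t addOfTrunc(std::uint32_t X, std::uint32_t C) {
  const std::uint8_t NarrowX = static_cast<std::uint8_t>(X);
  const std::uint8_t NarrowC = static_cast<std::uint8_t>(C); // plays the role of C'
  return static_cast<std::uint8_t>(NarrowX + NarrowC);
}

// Checks that both forms agree for every possible low byte of X.
void verifyForms(std::uint32_t HighBits, std::uint32_t C) {
  for (std::uint32_t Low = 0; Low < 256; ++Low) {
    const std::uint32_t X = (HighBits & 0xFFFFFF00u) | Low;
    assert(truncOfAdd(X, C) == addOfTrunc(X, C));
  }
}
```

For example, `verifyForms(0x12345600u, 0xFFu)` walks all 256 low bytes and checks that the two forms agree on each.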
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
The missed load folding noticed in D55898 is visible independent of that change 
either with an adjusted IR pattern to start or with AVX2/AVX512 (where the build 
vector becomes a broadcast first; movddup is not produced until we get into isel 
via tablegen patterns).
Differential Revision: https://reviews.llvm.org/D55936
llvm-svn: 350005
 | 
| | 
| 
| 
| 
| 
|  | 
When dumping string or address indexes
llvm-svn: 349997
 | 
| | 
| 
| 
| 
| 
|  | 
(addr attributes coming shortly)
llvm-svn: 349996
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Currently the section name (& possibly number) is only printed on
addresses in ranges - but no reason it couldn't also be displayed on
other addresses (like low/high PC).
Refactor in that direction by pulling out the section lookup and name
ambiguity dumping logic into a reusable helper.
llvm-svn: 349995
 | 
| | 
| 
| 
|  | 
llvm-svn: 349985
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
X86::AddrBaseReg/AddrIndexReg, etc. instead of hardcoded constants.
Makes the code a little more readable.
llvm-svn: 349983
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
NVPTXAsmPrinter::doInitialization() was creating an NVPTXSubtarget on
the stack.  This object is huge, about 80kb.  Also it's slow to create.
And it's all redundant; we have one in NVPTXTargetMachine anyway!
llvm-svn: 349982
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
Propagate the llvm::Error a little further up. This is NFC for
llvm-dwarfdump in this change, but allows ld.lld to emit more precise
error messages about which object and archive the erroneous DWARF is in.
llvm-svn: 349978
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Summary:
The "single parameter" .file directive appears to be an ELF-only feature
that is intended to insert the main source filename into the string
table.
I noticed that if you assemble an ELF .s file for COFF, typically it
will assert right away on a .file directive near the top of the file. My
first change was to make this emit a proper error in the asm parser so
that we don't assert so easily.
However, COFF actually does have some support for this directive, and if
you emit an object file, llvm-mc does not assert. When emitting a COFF
object, MC will take those file names and create "debug" symbol table
entries for them. I'm not familiar with these kinds of symbol table
entries, and I'm not aware of any users of them, but @compnerd added
them a while ago. They don't introduce absolute paths, and most main
source file paths are short enough that this extra entry shouldn't cause
any problems, so I enabled the flag in MCAsmInfoCOFF that indicates that
it's supported.
This has the side effect of adding an extra debug symbol to every object
produced by clang, which is a pretty big functional change. My question
is, should we keep the functionality or remove it in the name of symbol
table minimalism?
Reviewers: mstorsjo, compnerd
Subscribers: hiraditya, compnerd, llvm-commits
Differential Revision: https://reviews.llvm.org/D55900
llvm-svn: 349976
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D56030
llvm-svn: 349975
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Summary:
Added a pair of APIs for encoding/decoding the 3 components of a DWARF discriminator described in http://lists.llvm.org/pipermail/llvm-dev/2016-October/106532.html: the base discriminator, the duplication factor (useful in profile-guided optimization) and the copy index (used to identify copies of code in cases like loop unrolling)
The encoding packs 3 unsigned values in 32 bits. This CL addresses 2 issues:
- communicates overflow back to the user
- supports encoding all 3 components together. Current APIs assume a sequencing of events. For example, creating a new discriminator based on an existing one by changing the base discriminator was not supported.
Reviewers: davidxl, danielcdh, wmi, dblaikie
Reviewed By: dblaikie
Subscribers: zzheng, dmgreen, aprantl, JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D55681
llvm-svn: 349973
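A rough sketch of the two points above: packing three components into 32 bits while reporting overflow, and rebuilding a discriminator with a new base. The fixed 12/12/8-bit layout and every name here are assumptions made for illustration only. They are not the encoding or the API that this change adds to LLVM.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical component triple; field names are illustrative only.
struct DiscriminatorParts {
  std::uint32_t Base;
  std::uint32_t DupFactor;
  std::uint32_t CopyIndex;
};

// Assumed fixed-width layout (NOT LLVM's actual encoding): 12 + 12 + 8 bits.
constexpr unsigned BaseBits = 12, DupBits = 12, CopyBits = 8;

// Packs all three components at once; returns nullopt if any one overflows.
std::optional<std::uint32_t> encodeParts(const DiscriminatorParts &P) {
  if (P.Base >= (1u << BaseBits) || P.DupFactor >= (1u << DupBits) ||
      P.CopyIndex >= (1u << CopyBits))
    return std::nullopt;
  return P.Base | (P.DupFactor << BaseBits) |
         (P.CopyIndex << (BaseBits + DupBits));
}

// Splits a packed value back into its three components.
DiscriminatorParts decodeParts(std::uint32_t D) {
  return {D & ((1u << BaseBits) - 1),
          (D >> BaseBits) & ((1u << DupBits) - 1),
          D >> (BaseBits + DupBits)};
}

// Builds a new discriminator from an existing one by changing only the base.
std::optional<std::uint32_t> withBase(std::uint32_t D, std::uint32_t NewBase) {
  DiscriminatorParts P = decodeParts(D);
  P.Base = NewBase;
  return encodeParts(P);
}
```

Returning `std::optional` makes the overflow case visible at the call site, so callers cannot silently use a truncated value.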
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
CU is empty (devoid of code addresses)
Originally committed in r349333, reverted in r349353.
GCC emitted these unconditionally on/before 4.4/March 2012
Clang emitted these unconditionally on/before 3.5/March 2014
This improves performance when parsing CUs (especially those using split
DWARF) that contain no code ranges (such as the mini CUs that may be
created by ThinLTO importing - though generally they should be/are
avoided, especially for Split DWARF because it produces a lot of very
small CUs, which don't scale well in a bunch of other ways too
(including size)).
The revert was due to a (Google internal) test that had some checked in old
object files missing DW_AT_ranges. That's since been fixed.
llvm-svn: 349968
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Instruction::isLifetimeStartOrEnd() checks whether an Instruction is an
llvm.lifetime.start or an llvm.lifetime.end intrinsic.
This was suggested as a cleanup in D55967.
Differential Revision: https://reviews.llvm.org/D56019
llvm-svn: 349964
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
turned the root nodes into one of the flag producing binops.
This fixes the patterns that have or/and as a root. 'and' is handled differently since they usually have a CMP wrapped around them.
I had to look for uses of the CF flag because all these nodes have non-standard CF flag behavior. A real or/xor would always clear CF. In practice we shouldn't be using the CF flag from these nodes as far as I know.
Differential Revision: https://reviews.llvm.org/D55813
llvm-svn: 349962
 | 
| | 
| 
| 
|  | 
llvm-svn: 349958
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
sign flag is used.
The BEXTR instruction documents the SF bit as undefined.
The TBM BEXTR instruction has the same issue, but I'm not sure how to test it. With the control being an immediate we can determine the sign bit is 0 or the BEXTR would have been removed.
Fixes PR40060
Differential Revision: https://reviews.llvm.org/D55807
llvm-svn: 349956
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
negative in Indirect addressing.
Summary:
  Don't peel off the offset if the resulting base could possibly be negative in Indirect addressing.
This is because the M0 field is unsigned.
This patch achieves a similar goal to https://reviews.llvm.org/D55241, but keeps the optimization
if the base is known unsigned.
Reviewers:
  arsemn
Differential Revision:
  https://reviews.llvm.org/D55568
llvm-svn: 349951
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Weak symbols are supposed to be supported in the ELF TextAPI
implementation, but the YAML handler didn't read or write the `Weak`
member of ELFSymbol. This change adds the YAML mapping and updates tests
to ensure correct behavior.
Differential Revision: https://reviews.llvm.org/D56020
llvm-svn: 349950
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Summary:
BasicAA has special logic for unescaped allocas, which normally applies
equally well to dynamic and static allocas. However, llvm.stackrestore
has the power to end the lifetime of dynamic allocas, without referring
to them directly.
stackrestore is already marked with the most conservative memory
modification attributes, but because the alloca is not escaped, the
normal logic produces incorrect results. I think BasicAA needs a special
case here to teach it about the relationship between dynamic allocas and
stackrestore.
Fixes PR40118
Reviewers: gbiv, efriedma, george.burgess.iv
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D55969
llvm-svn: 349945
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
Currently, runtime unrolling does not support loops where multiple
exiting blocks exit to the latchExit. Added TODO and other code
clarifications for ConnectProlog code.
llvm-svn: 349944
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
This is admittedly a narrow fix for the problem:
https://bugs.llvm.org/show_bug.cgi?id=37502
...but as the XOP restriction shows, it's a maze to get this right. 
In the motivating example, note that we have movddup before SSE4.1 and 
again with AVX2. That's because insertps isn't available pre-SSE41 and 
vbroadcast is (more generally) available with AVX2 (and the splat is 
reduced to movddup via isel pattern).
Differential Revision: https://reviews.llvm.org/D55898
llvm-svn: 349937
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Fixes PR35023.
Reviewers: MatzeB, t.p.northover, sunfish, qcolombet, efriedma
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D55909
llvm-svn: 349935
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
This adds support for widening G_FCEIL in LegalizerHelper and
AArch64LegalizerInfo. More specifically, it teaches the AArch64 legalizer to
widen G_FCEIL from a 16-bit float to a 32-bit float when the subtarget doesn't
support full FP16.
This also updates AArch64/f16-instructions.ll to show that we perform the
correct transformation.
llvm-svn: 349927
 | 
| | 
| 
| 
| 
| 
|  | 
Change order of conditions in predicate.
llvm-svn: 349918
 | 
| | 
| 
| 
| 
| 
|  | 
Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output parameter version.
llvm-svn: 349915
 | 
| | 
| 
| 
| 
| 
|  | 
Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output parameter version.
llvm-svn: 349914
 | 
| | 
| 
| 
| 
| 
|  | 
Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output parameter version.
llvm-svn: 349912
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
value. NFCI.
Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output parameter version.
llvm-svn: 349911
 | 
| | 
| 
| 
| 
| 
|  | 
Continues the work started by @bogner in rL340594 to remove uses of the KnownBits output parameter version.
llvm-svn: 349909
 |