| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
|
|
| |
fp16_to_fp in case the input type isn't an i16.
The bitcast can be further legalized as needed.
Fixes PR38533.
llvm-svn: 339533
|
| |
|
|
|
|
|
| |
Addresses fixme, although this should still be checking individual
operand flags.
llvm-svn: 339525
|
| |
|
|
|
|
|
|
| |
for XOR. NFCI
We were checking for all bits being Known by checking Known.Zero|Known.One, but if all the bits are known then the value should be a Constant and we can just check for that instead.
llvm-svn: 339509
|
| |
|
|
| |
llvm-svn: 339508
|
| |
|
|
|
|
|
|
|
|
| |
The previous name sounds like it inserts cfguard implementation, but it
really just emits the table of address-taken functions. Change the name
to better reflect that.
Clang will be updated in the next commit.
llvm-svn: 339419
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The TType encoding, LSDA encoding, and personality encoding are all
passed explicitly by CodeGen to the assembler through .cfi_* directives,
so only the AsmPrinter needs to know about them.
The FDE CFI encoding however, controls the encoding of the label
implicitly created by the .cfi_startproc directive. That directive seems
to be special in that it doesn't take an encoding, so the assembler just
has to know how to encode one DSO-local label reference from .eh_frame
to .text.
As a result, it looks like MC will continue to have to know when the
large code model is in use. Perhaps we could invent a '.cfi_startproc
[large]' flag so that this knowledge doesn't need to pollute the
assembler.
Reviewers: davide, lliu0, JDevlieghere
Subscribers: hiraditya, fedor.sergeev, llvm-commits
Differential Revision: https://reviews.llvm.org/D50533
llvm-svn: 339397
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Similar to rL337966 - if the DAGCombiner's rotate matching was
working as expected, I don't think we'd see any test diffs here.
AArch only goes right, and PPC only goes left.
x86 has both, so no diffs there.
Differential Revision: https://reviews.llvm.org/D50091
llvm-svn: 339359
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This change provides a common optimization path for both Unsafe and FMF driven optimization for this fsub fold adding reassociation, as it the flag that most closely represents the translation
Reviewers: spatel, wristow, arsenm
Reviewed By: spatel
Subscribers: wdng
Differential Revision: https://reviews.llvm.org/D50195
llvm-svn: 339357
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The interface to get size and spill size of a register
was moved from MCRegisterInfo to TargetRegisterInfo over
a year ago. Afaik the old interface has bee around
to give out-of-tree targets a chance to adapt to the
new interface.
One problem with the old MCRegisterClass::PhysRegSize was that
it represented the size of a register as "size in bits" / 8.
So a register had to be a multiple of eight bits wide for the
size to be correct (and the byte size for the target needed to
be eight bits).
Reviewers: kparzysz, qcolombet
Reviewed By: kparzysz
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D47199
llvm-svn: 339350
|
| |
|
|
|
|
| |
As requested in D50392, pull the magic constant calculations out into a helper function.
llvm-svn: 339346
|
| |
|
|
|
|
|
|
|
| |
isNegatibleForFree() should not matter here (as the test diffs show)
because it's always a win to replace an fsub+fadd with fneg. The
problem in D50195 persists because either (1) we are doing these
folds in the wrong order or (2) we're missing another fold for fadd.
llvm-svn: 339299
|
| |
|
|
|
|
|
|
|
| |
I don't know if it's possible to expose this diff in a test,
but we should always try simplifications (no new nodes created)
before more complicated transforms for efficiency (similar to
what we do in IR).
llvm-svn: 339298
|
| |
|
|
| |
llvm-svn: 339274
|
| |
|
|
|
|
|
|
|
| |
When using APPLE extensions, don't duplicate the compiler invocation's
flags both in AT_producer and AT_APPLE_flags.
Differential revision: https://reviews.llvm.org/D50453
llvm-svn: 339268
|
| |
|
|
|
|
|
|
| |
call. NFCI.
The isConstOrConstSplat result is only used in a ISD::matchUnaryPredicate call which can perform the equivalent iteration just as quickly.
llvm-svn: 339262
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Currently, in line with GCC, when specifying reserved registers like sp or pc on an inline asm() clobber list, we don't always preserve the original value across the statement. And in general, overwriting reserved registers can have surprising results.
For example:
```
extern int bar(int[]);
int foo(int i) {
int a[i]; // VLA
asm volatile(
"mov r7, #1"
:
:
: "r7"
);
return 1 + bar(a);
}
```
Compiled for thumb, this gives:
```
$ clang --target=arm-arm-none-eabi -march=armv7a -c test.c -o - -S -O1 -mthumb
...
foo:
.fnstart
@ %bb.0: @ %entry
.save {r4, r5, r6, r7, lr}
push {r4, r5, r6, r7, lr}
.setfp r7, sp, #12
add r7, sp, #12
.pad #4
sub sp, #4
movs r1, #7
add.w r0, r1, r0, lsl #2
bic r0, r0, #7
sub.w r0, sp, r0
mov sp, r0
@APP
mov.w r7, #1
@NO_APP
bl bar
adds r0, #1
sub.w r4, r7, #12
mov sp, r4
pop {r4, r5, r6, r7, pc}
...
```
r7 is used as the frame pointer for thumb targets, and this function needs to restore the SP from the FP because of the variable-length stack allocation a. r7 is clobbered by the inline assembly (and r7 is included in the clobber list), but LLVM does not preserve the value of the frame pointer across the assembly block.
This type of behavior is similar to GCC's and has been discussed on the bugtracker: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11807 . No consensus seemed to have been reached on the way forward. Clang behavior has briefly been discussed on the CFE mailing (starting here: http://lists.llvm.org/pipermail/cfe-dev/2018-July/058392.html). I've opted for following Eli Friedman's advice to print warnings when there are reserved registers on the clobber list so as not to diverge from GCC behavior for now.
The patch uses MachineRegisterInfo's target-specific knowledge of reserved registers, just before we convert the inline asm string in the AsmPrinter.
If we find a reserved register, we print a warning:
```
repro.c:6:7: warning: inline asm clobber list contains reserved registers: R7 [-Winline-asm]
"mov r7, #1"
^
```
Reviewers: eli.friedman, olista01, javed.absar, efriedma
Reviewed By: efriedma
Subscribers: efriedma, eraman, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D49727
llvm-svn: 339257
|
| |
|
|
|
|
|
|
| |
Provide a pass-through of the numerator for divide by one cases - this is the same approach we take in DAGCombiner::visitSDIVLike.
I investigated whether we could achieve this by magic MULHU/SRL values but nothing appeared to work as we don't have a way for MULHU(x,c) -> x
llvm-svn: 339254
|
| |
|
|
|
|
|
|
| |
As requested in D50392, this is a minor refactor to BuildExactSDIV to stop taking the uniform constant APInt divisor and instead extract it locally.
I also cleanup the operands and valuetypes to better match BuildUDiv (and BuildSDIV in the near future).
llvm-svn: 339246
|
| |
|
|
|
|
|
|
|
|
| |
Summary: changing a few typos
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D50445
llvm-svn: 339245
|
| |
|
|
|
|
| |
We're not handling the UDIV by one special case properly - for now just early out.
llvm-svn: 339229
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Extend fix for PR34170 to support inline assembly with multiple output operands that do not naturally go in the register class it is constrained to (eg. double in a 32-bit GPR as in the PR).
Reviewers: bogner, t.p.northover, lattner, javed.absar, efriedma
Reviewed By: efriedma
Subscribers: efriedma, tra, eraman, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D45437
llvm-svn: 339225
|
| |
|
|
|
|
|
|
|
|
| |
serial chain dependency.
Scatter could have multiple identical indices. We need to maintain sequential order. We get this right in LegalizeVectorTypes, but not in this code.
Differential Revision: https://reviews.llvm.org/D50374
llvm-svn: 339157
|
| |
|
|
|
|
|
|
| |
This was missed in D50185.
NFC until we add actual non-uniform support to BuildSDIV (similar BuildUDIV support in D49248) - for now it just early outs.
llvm-svn: 339147
|
| |
|
|
|
|
| |
This was missed in D49248
llvm-svn: 339146
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes an inconsistency in code generation when compiling with or
without debug information (-g). When debug information is available in
an empty block, the original test would fail, resulting in possibly
different code.
Patch by: Jeroen Dobbelaere
Differential revision: https://reviews.llvm.org/D49467
llvm-svn: 339129
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The accelerator tables use the debug_str section to store their strings.
However, they do not support the indirect method of access that is
available for the debug_info section (DW_FORM_strx et al.).
Currently our code is assuming that all strings can/will be referenced
indirectly, and puts all of them into the debug_str_offsets section.
This is generally true for regular (unsplit) dwarf, but in the DWO case,
most of the strings in the debug_str section will only be used from the
accelerator tables. Therefore the contents of the debug_str_offsets
section will be largely unused and bloating the main executable.
This patch rectifies this by teaching the DwarfStringPool to
differentiate between strings accessed directly and indirectly. When a
user inserts a string into the pool it has to declare whether that
string will be referenced directly or not. If at least one user requsts
indirect access, that string will be assigned an index ID and put into
debug_str_offsets table. Otherwise, the offset table is skipped.
This approach reduces the overall binary size (when compiled with
-gdwarf-5 -gsplit-dwarf) in my tests by about 2% (debug_str_offsets is
shrunk by 99%).
Reviewers: probinson, dblaikie, JDevlieghere
Subscribers: aprantl, mgrang, llvm-commits
Differential Revision: https://reviews.llvm.org/D49493
llvm-svn: 339122
|
| |
|
|
|
|
|
|
|
|
| |
This patch refactors the existing TargetLowering::BuildUDIV base implementation to support non-uniform constant vector denominators.
It also includes a fold for MULHU by pow2 constants to SRL which can now more readily occur from BuildUDIV.
Differential Revision: https://reviews.llvm.org/D49248
llvm-svn: 339121
|
| |
|
|
|
|
| |
Src0 doesn't really convey any meaning to what the operand is. Passthru matches what's used in the documentation for the intrinsic this comes from.
llvm-svn: 339101
|
| |
|
|
|
|
| |
getValue is more meaningful name for scatter than it is for gather. Split them and use getPassThru for gather.
llvm-svn: 339096
|
| |
|
|
| |
llvm-svn: 339059
|
| |
|
|
|
|
|
|
|
|
|
| |
This assert fires when attempting to extract a subregister from the
global PIC base register. This virtual register SD node is not in the
VRBaseMap, so we shouldn't call getVR to look it up there. If this is a
RegisterSDNode, we should be able to use the virtual register directly.
Fixes PR38385
llvm-svn: 339056
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
for all the uses from the same def is done.
We run into a compile time problem with flex generated code combined with
`-fno-jump-tables`. The cause is that machineLICM hoists a lot of invariants
outside of a big loop, and drastically increases the compile time in global
register splitting and copy coalescing. https://reviews.llvm.org/D49353
relieves the problem in global splitting. This patch is to handle the problem
in copy coalescing.
About the situation where the problem in copy coalescing happens. After
machineLICM, we have several defs outside of a big loop with hundreds or
thousands of uses inside the loop. Rematerialization in copy coalescing
happens for each use and everytime rematerialization is done, shrinkToUses
will be called to update the huge live interval. Because we have 'n' uses
for a def, and each live interval update will have at least 'n' complexity,
the total update work is n^2.
To fix the problem, we try to do the live interval update work in a collective
way. If a def has many copylike uses larger than a threshold, each time
rematerialization is done for one of those uses, we won't do the live interval
update in time but delay that work until rematerialization for all those uses
are completed, so we only have to do the live interval update work once.
Delaying the live interval update could potentially change the copy coalescing
result, so we hope to limit that change to those defs with many
(like above a hundred) copylike uses, and the cutoff can be adjusted by the
option -mllvm -late-remat-update-threshold=xxx.
Differential Revision: https://reviews.llvm.org/D49519
llvm-svn: 339035
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the past, DbgInfoIntrinsic has a strong assumption that these
intrinsics all have variables and expressions attached to them.
However, it is too strong to derive the class for other debug entities.
Now, it has problems for debug labels.
In order to make DbgInfoIntrinsic as a base class for 'debug info', I
create a class for 'variable debug info', DbgVariableIntrinsic.
DbgDeclareInst, DbgAddrIntrinsic, and DbgValueInst will be derived from it.
Differential Revision: https://reviews.llvm.org/D50220
llvm-svn: 338984
|
| |
|
|
|
|
|
|
| |
https://reviews.llvm.org/D48600
Added IRTranslator support to translate these known intrinsics into GISel opcodes.
llvm-svn: 338944
|
| |
|
|
|
|
|
|
|
|
| |
store.
The mask operand is visited before the data operand so we need to be able to widen it.
Fixes PR38436.
llvm-svn: 338915
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Add a parameter for testing specifically for
sNaNs - at least one instruction pattern on AMDGPU
needs to check specifically for this.
Also handle more cases, and add a target hook
for custom nodes, similar to the hooks for known
bits.
llvm-svn: 338910
|
| |
|
|
|
|
|
|
| |
First step towards a BuildSDIV equivalent to D49248 for non-uniform vector support - this just pushes the splat detection down into TargetLowering::BuildSDIV where its still used.
Differential Revision: https://reviews.llvm.org/D50185
llvm-svn: 338838
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
At least on ELF, it's impossible to tell from the object file whether
two globals with the same section marking were merged: the merged global
uses "private" linkage to hide its symbol, and the aliases look like
regular symbols. I can't think of any other reason to disallow it.
(Of course, we can only merge globals in the same section.)
The weird alignment handling matches AsmPrinter; our alignment handling
for global variables should probably be refactored.
Differential Revision: https://reviews.llvm.org/D49822
llvm-svn: 338791
|
| |
|
|
| |
llvm-svn: 338715
|
| |
|
|
|
|
|
|
|
|
| |
Value
This is logical continuation of https://reviews.llvm.org/D46018 (r332449)
Differential Revision: https://reviews.llvm.org/D49660
llvm-svn: 338685
|
| |
|
|
|
|
|
|
|
|
|
|
| |
In expansion of FCOPYSIGN, the shift node is missing when the two
operands of FCOPYSIGN are of the same size. We should always generate
shift node (if the required shift bit is not zero) to put the sign
bit into the right position, regardless of the size of underlying
types.
Differential Revision: https://reviews.llvm.org/D49973
llvm-svn: 338665
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
AArch64 ELF ABI does not define a static relocation type for TLS offset within
a module, which makes it impossible for compiler to generate a valid
DW_AT_location content for thread local variables. Currently LLVM generates an
invalid R_AARCH64_ABS64 relocation at the DW_AT_location field for a TLS
variable. That causes trouble for linker because thread local variable does
not have an absolute address at link time. AArch64 GCC solves the problem by
not generating DW_AT_location for thread local variables. We should do the
same in LLVM.
Differential Revision: https://reviews.llvm.org/D43860
llvm-svn: 338655
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Added an option that allows to emit only '.loc' and '.file' kind debug
directives, but disables emission of the DWARF sections. Required for
NVPTX target to support profiling. It requires '.loc' and '.file'
directives, but does not require any DWARF sections for the profiler.
Reviewers: probinson, echristo, dblaikie
Subscribers: aprantl, JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D46021
llvm-svn: 338616
|
| |
|
|
| |
llvm-svn: 338604
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The bug is visible in the constant-folded x86 tests. We can't use the
negated shift amount when the type is not power-of-2:
https://rise4fun.com/Alive/US1r
...so in that case, use the regular lowering that includes a select
to guard against a shift-by-bitwidth. This path is improved by only
calculating the modulo shift amount once now.
Also, improve the rotate (with power-of-2 size) lowering to use
a negate rather than subtract from bitwidth. This improves the
codegen whether we have a rotate instruction or not (although
we can still see that we're not matching to a legal rotate in
all cases).
llvm-svn: 338592
|
| |
|
|
|
|
|
|
| |
There is nothing x86-specific about this code, so it'd be nice to make this available for other targets to use in the future (and get it out of X86ISelLowering!).
Differential Revision: https://reviews.llvm.org/D50083
llvm-svn: 338586
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D49806
llvm-svn: 338562
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Getting the DWARF types section is only implemented for ELF object
files. We already disabled emitting debug types in clang (r337717), but
now we also report an fatal error (rather than crashing) when trying to
obtain this section in MC. Additionally we ignore the generate debug
types flag for unsupported target triples.
See PR38190 for more information.
Differential revision: https://reviews.llvm.org/D50057
llvm-svn: 338527
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This revision implements support for generating DWARFv5 .debug_addr section.
The implementation is pretty straight-forward: we just check the dwarf version
and emit section header if needed.
Reviewers: aprantl, dblaikie, probinson
Reviewed by: dblaikie
Differential Revision: https://reviews.llvm.org/D50005
llvm-svn: 338487
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Previously we were just visiting the blocks in the function in IR order, which
is rather arbitrary. Therefore we wouldn't always visit defs before uses, but
the translation code relies on this assumption in some places.
Only codegen change seen in tests is an elision of a redundant copy.
Fixes PR38396
llvm-svn: 338476
|