| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
| |
Because of how Clang represents structs as arrays (at least on non-Darwin
platforms), and what SROA does, etc. this is no longer a problem.
llvm-svn: 225251
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
"ELF Handling for Thread-Local Storage" specifies that R_X86_64_GOTTPOFF
relocation target a movq or addq instruction.
Prohibit the truncation of such loads to movl or addl.
This fixes PR22083.
Differential Revision: http://reviews.llvm.org/D6839
llvm-svn: 225250
|
| |
|
|
|
|
| |
These are used for debugging output; NFC.
llvm-svn: 225249
|
| |
|
|
|
|
|
|
|
| |
The old target DAG combine that allowed for performing int_to_fp(fp_to_int(x))
without a load/store pair is updated here with support for unsigned integers,
and to support single-precision values without a third rounding step, on newer
cores with the appropriate instructions.
llvm-svn: 225248
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
a specific analysis result.
This is quite handy to test things, and will also likely be very useful
for debugging issues. You could narrow down pass validation failures by
walking these invalidate pass runs up and down the pass pipeline, etc.
I've added support to the pass pipeline parsing to be able to create one
of these for any analysis pass desired.
Just adding this class uncovered one latent bug where the
AnalysisManager CRTP base class had a hard-coded Module type rather than
using IRUnitT.
I've also added tests for invalidation and caching of analyses in
a basic way across all the pass managers. These in turn uncovered two
more bugs where we failed to correctly invalidate an analysis -- its
results were invalidated but the key for re-running the pass was never
cleared and so it was never re-run. Quite nasty. I'm very glad to debug
this here rather than with a full system.
Also, yes, the naming here is horrid. I'm going to update some of the
names to be slightly less awful shortly. But really, I've no "good"
ideas for naming. I'll be satisfied if I can get it to "not bad".
llvm-svn: 225246
|
| |
|
|
|
|
| |
We always select the 8-bit size and let the assembler backend relax to the larger size.
llvm-svn: 225243
|
| |
|
|
|
|
|
|
| |
long form.
The assembler backend will relax to the long form if necessary. This removes a swap from long form to short form in the MCInstLowering code. Selecting the long form used to be required by the old JIT.
llvm-svn: 225242
|
| |
|
|
|
|
|
|
|
|
|
|
| |
manager tests to use them and be significantly more comprehensive.
This, naturally, uncovered a bug where the CGSCC pass manager wasn't
printing analyses when they were run.
The only remaining core manipulator is I think an invalidate pass
similar to the require pass. That'll be next. =]
llvm-svn: 225240
|
| |
|
|
| |
llvm-svn: 225233
|
| |
|
|
| |
llvm-svn: 225232
|
| |
|
|
| |
llvm-svn: 225231
|
| |
|
|
|
|
| |
I've filed http://llvm.org/PR22100 to track this issue.
llvm-svn: 225228
|
| |
|
|
| |
llvm-svn: 225227
|
| |
|
|
|
|
|
|
|
|
| |
Now that `LLVMContextImpl` can call `MDNode::dropAllReferences()` to
prevent teardown madness, stop dropping uniquing just because an operand
drops to null.
Part of PR21532.
llvm-svn: 225223
|
| |
|
|
|
|
|
|
|
| |
This reverts commit r225213. It's failing on multiple buildbots [1][2].
[1]: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/22032
[2]: http://lab.llvm.org:8080/green/view/Clang/job/clang-stage1-cmake-RA-incremental_check/2357/
llvm-svn: 225222
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We no longer generate horrible code for the stated function:
void f(signed char *a, _Bool b, _Bool c) {
signed char t = 0;
if (b) t = *a;
if (c) *a = t;
}
for which we now generate:
.L.f:
andi. 5, 5, 1
cmpldi 1, 4, 0
li 5, 0
beq 1, .LBB0_2
lbz 5, 0(3)
.LBB0_2: # %if.end
bclr 4, 1, 0
stb 5, 0(3)
blr
so we don't need the README.txt entry.
llvm-svn: 225217
|
| |
|
|
|
|
| |
Removed local isSequential predicate and use standard helper isSequentialOrUndefInRange instead.
llvm-svn: 225216
|
| |
|
|
|
|
|
| |
We now produce the desired code as noted in the README.txt file (no spurious
or). Remove the README entry and improve the regression test.
llvm-svn: 225214
|
| |
|
|
| |
llvm-svn: 225213
|
| |
|
|
|
|
|
| |
This entry has been rendered irrelevant now that we have proper CR bit
tracking.
llvm-svn: 225211
|
| |
|
|
|
|
| |
memop instructions. Removing old defs without bits and updating references.
llvm-svn: 225210
|
| |
|
|
|
|
|
| |
We now produce the desired code as noted in the README.txt file. Remove the
README entry and add a regression test.
llvm-svn: 225209
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
dsymutil would like to use all the AsmPrinter/MCStreamer infrastructure
to stream out the DWARF. In order to do so, it will reuse the DIE object
and so this header needs to be public.
The interface exposed here has some corners that cannot be used without a
DwarfDebug object, but clients that want to stream Dwarf can just avoid
these.
Differential Revision: http://reviews.llvm.org/D6695
llvm-svn: 225208
|
| |
|
|
|
|
|
| |
We now produce the desired code as noted in the README.txt file. Remove the
README entry and add a regression test.
llvm-svn: 225205
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Consider this function from our README.txt file:
int foo(int a, int b) { return (a < b) << 4; }
We now explicitly track CR bits by default, so the comment in the README.txt
about not really having a SETCC is no longer accurate, but we did generate this
somewhat silly code:
cmpw 0, 3, 4
li 3, 0
li 12, 1
isel 3, 12, 3, 0
sldi 3, 3, 4
blr
which generates the zext as a select between 0 and 1, and then shifts the
result by a constant amount. Here we preprocess the DAG in order to fold the
results of operations on an extension of an i1 value into the SELECT_I[48]
pseudo instruction when the resulting constant can be materialized using one
instruction (just like the 0 and 1). This was not implemented as a DAGCombine
because the resulting code would have been anti-canonical and depends on
replacing chained user nodes, which does not fit well into the lowering
paradigm. Now we generate:
cmpw 0, 3, 4
li 3, 0
li 12, 16
isel 3, 12, 3, 0
blr
which is less silly.
llvm-svn: 225203
|
| |
|
|
| |
llvm-svn: 225202
|
| |
|
|
|
|
| |
accumulating shifts.
llvm-svn: 225201
|
| |
|
|
|
|
| |
`LLVMContext` isn't actually used.
llvm-svn: 225200
|
| |
|
|
|
|
| |
encoding bits.
llvm-svn: 225199
|
| |
|
|
| |
llvm-svn: 225198
|
| |
|
|
| |
llvm-svn: 225197
|
| |
|
|
|
|
|
|
|
| |
The 64-bit semantics of cntlzw are not special, the 32-bit population count is
stored as a 64-bit value in the range [0,32]. As a result, it is always zero
extended, and it can be added to the PPCISelDAGToDAG peephole optimization as a
frontier instruction for the removal of unnecessary zero extensions.
llvm-svn: 225192
|
| |
|
|
|
|
|
|
|
| |
lhbrx and lwbrx not only load their data with byte swapping, but also clear the
upper 32 bits (at least). As a result, they can be added to the PPCISelDAGToDAG
peephole optimization as frontier instructions for the removal of unnecessary
zero extensions.
llvm-svn: 225189
|
| |
|
|
| |
llvm-svn: 225188
|
| |
|
|
|
|
|
|
| |
The swap implementation for iplist is currently unsupported. Simply splice the
old list into place, which achieves the same purpose. This is needed in order
to thread the -frewrite-map-file frontend option correctly. NFC.
llvm-svn: 225186
|
| |
|
|
|
|
| |
Wrap a couple of lines. NFC.
llvm-svn: 225185
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We used to generate code similar to:
umov.b w8, v0[2]
strb w8, [x0, x1]
because the STR*ro* patterns were preferred to ST1*.
Instead, we can avoid going through GPRs, and generate:
add x8, x0, x1
st1.b { v0 }[2], [x8]
This patch increases the ST1* AddedComplexity to achieve that.
rdar://16372710
Differential Revision: http://reviews.llvm.org/D6202
llvm-svn: 225183
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the subregister.
For 0-lane stores, we used to generate code similar to:
fmov w8, s0
str w8, [x0, x1, lsl #2]
instead of:
str s0, [x0, x1, lsl #2]
To correct that: for store lane 0 patterns, directly match to STR <subreg>0.
Byte-sized instructions don't have the special case for a 0 index,
because FPR8s are defined to have untyped content.
rdar://16372710
Differential Revision: http://reviews.llvm.org/D6772
llvm-svn: 225181
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This patch lowers patterns such as-
fsub v0.4s, v0.4s, v1.4s
fabs v0.4s, v0.4s
to
fabd v0.4s, v0.4s, v1.4s
on AArch64.
Review: http://reviews.llvm.org/D6791
llvm-svn: 225169
|
| |
|
|
|
|
|
|
| |
Tag_compatibility takes two arguments, but before this patch it would
erroneously accept just one, it now produces an error in that case.
Change-Id: I530f918587620d0d5dfebf639944d6083871ef7d
llvm-svn: 225167
|
| |
|
|
|
|
|
|
|
|
|
| |
Claim conformance to version 2.09 of the ARM ABI.
This build attribute must be emitted first amongst the build attributes when
written to an object file. This is to simplify conformance detection by
consumers.
Change-Id: If9eddcfc416bc9ad6e5cc8cdcb05d0031af7657e
llvm-svn: 225166
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This patch lowers patterns such as-
sub v0.4s, v0.4s, v1.4s
abs v0.4s, v0.4s
to
sabd v0.4s, v0.4s, v1.4s
on AArch64.
Review: http://reviews.llvm.org/D6781
llvm-svn: 225165
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
when all are being preserved.
We want to short-circuit this for a couple of reasons. One, I don't
really want passes to grow a dependency on actually receiving their
invalidate call when they've been preserved. I'm thinking about removing
this entirely. But more importantly, preserving everything is likely to
be the common case in a lot of scenarios, and it would be really good to
bypass all of the invalidation and preservation machinery there.
Avoiding calling N opaque functions to try to invalidate things that are
by definition still valid seems important. =]
This wasn't really inpsired by much other than seeing the spam in the
logging for analyses, but it seems better ot get it checked in rather
than forgetting about it.
llvm-svn: 225163
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
manager.
This starts to allow us to test analyses more easily, but it's really
only the beginning. Some of the code here is still untestable without
manual changes to create analysis passes, but I wanted to factor it into
a small of chunks as possible.
Next up in order to be able to test things are, in no particular order:
- No-op analyses passes so we don't have to use real ones to exercise
the pass maneger itself.
- Automatic way of generating dummy passes that require an analysis be
run, including a variant that calls a 'print' method on a pass to make
it even easier to print out the results of an analysis.
- Dummy passes that invalidate all analyses for their IR unit so we can
test invalidation and re-runs.
- Automatic way to print each analysis pass as it is re-run.
- Automatic but optional verification of analysis passes everywhere
possible.
I'm not claiming I'll get to all of these immediately, but that's what
is in the pipeline at some stage. I'm fleshing out exactly what I need
and what to prioritize by working on converting analyses and then trying
to test the conversion. =]
llvm-svn: 225162
|
| |
|
|
|
|
| |
into the assert.
llvm-svn: 225160
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
following loop should not be vectorized with current algorithm.
{code}
// loop body
... = a[i] (1)
... = a[i+1] (2)
.......
a[i+1] = .... (3)
a[i] = ... (4)
{code}
The algorithm tries to collect memory access candidates from AliasSetTracker, and then check memory dependences one another. The memory accesses are unique in AliasSetTracker, and a single memory access in AliasSetTracker may map to multiple entries in AccessAnalysis, which could cover both 'read' and 'write'. Originally the algorithm only checked 'write' entry in Accesses if only 'write' exists. This is incorrect and the consequence is it ignored all read access, and finally some RAW and WAR dependence are missed.
For the case given above, if we ignore two reads, the dependence between (1) and (3) would not be able to be captured, and finally this loop will be incorrectly vectorized.
The fix simply inserts a new loop to find all entries in Accesses. Since it will skip most of all other memory accesses by checking the Value pointer at the very beginning of the loop, it should not increase compile-time visibly.
llvm-svn: 225159
|
| |
|
|
|
|
| |
dec instructions. Remove the 32-bit mode only versions that existed for the disassembler. Move the patterns out of the instructions so they can still be qualified with predicates.
llvm-svn: 225157
|
| |
|
|
|
|
| |
incrementing. NFC
llvm-svn: 225156
|
| |
|
|
|
|
| |
assignment as the beginning of the function. NFC.
llvm-svn: 225155
|
| |
|
|
|
|
| |
since the code is in a comment. Can't figure out what the body of the 'if' was supposed to be anyway.
llvm-svn: 225154
|