| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The original patch caused crashes because it could derefence a null pointer
for SelectionDAGTargetInfo for targets that do not define it.
Evaluates fmul+fadd -> fmadd combines and similar code sequences in the
machine combiner. It adds support for float and double similar to the existing
integer implementation. The key features are:
- DAGCombiner checks whether it should combine greedily or let the machine
combiner do the evaluation. This is only supported on ARM64.
- It gives preference to throughput over latency: the heuristic used is
to combine always in loops. The targets decides whether the machine
combiner should optimize for throughput or latency.
- Supports for fmadd, f(n)msub, fmla, fmls patterns
- On by default at O3 ffast-math
llvm-svn: 267328
|
| |
|
|
|
|
| |
cmove/ne+cttz/ctlz. These are folded by DAG combine now.
llvm-svn: 267326
|
| |
|
|
|
|
| |
string is just true or 1.
llvm-svn: 267324
|
| |
|
|
|
|
| |
text string which always evaluates to true. Add a ! so they'll evaluate to false.
llvm-svn: 267312
|
| |
|
|
|
|
| |
instructions. Only one of the conditions should be valid for each pattern, not both. Update tests accordingly.
llvm-svn: 267311
|
| |
|
|
|
|
|
|
|
| |
The option to control the emission of the new relocations
is -relax-relocations (blatantly copied from GNU as).
It can't be enabled by default because it breaks relatively
recent versions of ld.bfd/ld.gold (late 2015).
llvm-svn: 267307
|
| |
|
|
|
|
| |
In preparation for other changes.
llvm-svn: 267300
|
| |
|
|
|
|
| |
This reverts commit r267206, as it broke self-hosting on AArch64.
llvm-svn: 267294
|
| |
|
|
|
|
| |
convert them to ctlz/cttz. Remove the now unneccessary isel patterns. NFC
llvm-svn: 267266
|
| |
|
|
|
|
| |
ctlz. Remove the now unneccessary isel patterns. NFC
llvm-svn: 267265
|
| |
|
|
|
|
| |
will convert them to ctlz/cttz. Remove the now unneccessary isel patterns. NFC
llvm-svn: 267264
|
| |
|
|
| |
llvm-svn: 267244
|
| |
|
|
|
|
| |
This fixes test regressions when i64 loads/stores are made promote.
llvm-svn: 267240
|
| |
|
|
| |
llvm-svn: 267229
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
The relative vtable ABI (PR26723) needs PLT relocations to refer to virtual
functions defined in other DSOs. The unnamed_addr attribute means that the
function's address is not significant, so we're allowed to substitute it
with the address of a PLT entry.
Also includes a bonus feature: addends for COFF image-relative references.
Differential Revision: http://reviews.llvm.org/D17938
llvm-svn: 267211
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
The opcode for the optimized branch does not depend on the size
of the activate bits in the AND masks, but the AND opcode itself.
Indeed, we need to use a X or W variant based on the AND variant
not based on whether the mask fits into the related variant.
Otherwise, we may end up using the W variant of the optimized branch
for 64-bit register inputs!
This fixes the last make check verifier issues for AArch64: PR27479.
llvm-svn: 267206
|
| |
|
|
|
|
|
|
| |
declared as a definition.
This fixes the machine verifier error for CodeGen/AArch64/nzcv-save.ll.
llvm-svn: 267185
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We used to simply set the kill flags to true when transforming a scalar
instruction to a vector one.
SrcScalar1 = copy SrcVector1
... = opScalar SrcScalar1
=>
SrcScalar1 = copy SrcVector1
... = opVector SrcVector1<kill>
This is obviously wrong. The proper update consists in:
1. Propagate the kill status from the copy to the new opVector
2. Reset the kill status on the copy, since the live-range of
SrcVector1 got extended.
This fixes some of the machine verifier errors for AArch64 with make check.
llvm-svn: 267180
|
| |
|
|
| |
llvm-svn: 267178
|
| |
|
|
| |
llvm-svn: 267173
|
| |
|
|
|
|
|
|
|
|
|
| |
- Switch few loops to range-based for loops
- Fix nop insertion at the end of BB
- Fix formatting
- Check for endpgm
Differential Revision: http://reviews.llvm.org/D19380
llvm-svn: 267167
|
| |
|
|
|
|
| |
Commit r266861 was the reason for failing tests in LLVM test suite.
llvm-svn: 267166
|
| |
|
|
| |
llvm-svn: 267165
|
| |
|
|
|
|
| |
Also add tests for other instructions from HexagonSystemInst.td.
llvm-svn: 267162
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When generating assembly using -m16 we must explicitly mark it as
16-bit. Emit .code16 at beginning of file. Fixes wrong results when
using -fno-integrated-as.
Reviewers: dwmw2
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D19392
llvm-svn: 267152
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
When targetting MIPS64R6 some of the patterns for select were guarded by a
broken predicate. The predicate was supposed to test if a constant value
could fit in a 16 bit zero-extended field. Instead the value was tested to
fit in a 16 bit sign-extended field. For negative constants of native word
width this resulted in wrong code generation.
Reviewers: vkalintiris, dsanders
Differential Review: http://reviews.llvm.org/D19378
llvm-svn: 267151
|
| |
|
|
| |
llvm-svn: 267149
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D19354
llvm-svn: 267137
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D15026
llvm-svn: 267130
|
| |
|
|
|
|
| |
It introduced buildbot failures on clang-cmake-mips, clang-ppc64le-linux, among others.
llvm-svn: 267127
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
rL256194 transforms truncations between vectors of integers into PACKUS/PACKSS
operations during DAG combine. This generates better code for truncate, so cost
of truncate needs to be changed but looks like it got changed only in SSE2 table
Whereas this change is also applicable for SSE4.1, so the cost of truncate needs
to be changed for that as well. Cost of “TRUNCATE v16i32 to v16i8” & “TRUNCATE
v16i16 to v16i8” should be same in SSE4.1 & SSE2 table. Removing their cost from
SSE4.1, so it will fall back to SSE2.
Reviewers: Simon Pilgrim
llvm-svn: 267123
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Specifically, itineraries for LEON processors has been added, along with several LEON processor Subtargets. Although currently all these targets are pretty much identical, support for features that will differ among these processors will be added in the very near future.
The different Instruction Itinerary Classes (IICs) added are sufficient to differentiate between the instruction timings used by LEON and, quite probably, by generic Sparc processors too, but the focus of the exercise has been for LEON processors, as the requirement of my project. If the IICs are not sufficient for other Sparc processor types and you want to add a new itinerary for one of those, it should be relatively trivial to adapt this.
As none of the LEON processors has Quad Floats, or is a Version 9 processor, none of those instructions have itinerary classes defined and revert to the default "NoItinerary" instruction itinerary.
Phabricator Review: http://reviews.llvm.org/D19359
llvm-svn: 267121
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
take the address of a global object:
void func1() {
...
}
int main(int argc, char** argv) {
void (*pFunc)();
pFunc = &func1
pFunc();
...
}
Phabricator review: http://reviews.llvm.org/D19368
llvm-svn: 267120
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D18687
llvm-svn: 267114
|
| |
|
|
| |
llvm-svn: 267112
|
| |
|
|
|
|
| |
since the custom logic just did what Expand does when CTTZ/CTLZ are Legal. NFC
llvm-svn: 267109
|
| |
|
|
|
|
| |
they will be converted to CTLZ/CTTZ by LegalizeDAG. Remove extra instructions that only existed to to contain patterns that match the zero_undef operations. NFC
llvm-svn: 267108
|
| |
|
|
| |
llvm-svn: 267107
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This intrinsic returns true if the current thread belongs to a live pixel
and false if it belongs to a pixel that we are executing only for derivative
computation. It will be used by Mesa to implement gl_HelperInvocation.
Note that for pixels that are killed during the shader, this implementation
also returns true, but it doesn't matter because those pixels are always
disabled in the EXEC mask.
This unearthed a corner case in the instruction verifier, which complained
about a v_cndmask 0, 1, exec, exec<imp-use> instruction. That's stupid but
correct code, so make the verifier accept it as such.
Reviewers: arsenm, tstellarAMD
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D19191
llvm-svn: 267102
|
| |
|
|
|
|
| |
CTTZ_ZERO_UNDEF even without VLX support. We can just extend to 512-bits and extract like we do for CTLZ.
llvm-svn: 267100
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Evaluates fmul+fadd -> fmadd combines and similar code sequences in the
machine combiner. It adds support for float and double similar to the existing
integer implementation. The key features are:
- DAGCombiner checks whether it should combine greedily or let the machine
combiner do the evaluation. This is only supported on ARM64.
- It gives preference to throughput over latency: the heuristic used is
to combine always in loops. The targets decides whether the machine
combiner should optimize for throughput or latency.
- Supports for fmadd, f(n)msub, fmla, fmls patterns
- On by default at O3 ffast-math
llvm-svn: 267098
|
| |
|
|
|
|
| |
This follows the current binary format rules.
llvm-svn: 267082
|
| |
|
|
|
|
|
|
|
|
|
| |
WIN__DBZCHK will insert a CBZ instruction into the stream. This instruction
reserves 3 bits for the condition register (rn). As such, we must ensure that
we restrict the register to a low register. Use the tGPR class instead of GPR
to ensure that this is properly constrained. In debug builds, we would attempt
to use lr as a condition register which would silently get truncated with no
hint that the register selection was incorrect.
llvm-svn: 267080
|
| |
|
|
|
|
|
|
|
|
| |
We'd disabled them on x86 because back in the early days some host tools
couldn't handle the new load commands. This no longer holds: anyone capable of
deploying Clang should be able to deploy its copies of ar/ranlib/etc.
rdar://25254790
llvm-svn: 267075
|
| |
|
|
| |
llvm-svn: 267038
|
| |
|
|
| |
llvm-svn: 267034
|
| |
|
|
|
|
|
| |
Return bool instead of void so that it is natural to put the calls into
asserts.
llvm-svn: 267033
|
| |
|
|
|
|
| |
I get this wrong every time I try to debug this.
llvm-svn: 267030
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
IntrReadWriteArgMem simply becomes IntrArgMemOnly.
So there are fewer intrinsic properties that express their orthogonality
better, and correspond more closely to the corresponding IR attributes.
Suggested by: Philip Reames
Reviewers: joker.eph, reames, tstellarAMD
Subscribers: jholewinski, arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D19291
llvm-svn: 267021
|
| |
|
|
|
|
|
|
| |
r266809 incorrectly used LD to load the stack guard, it should be LWZ.
Differential Revision: http://reviews.llvm.org/D19358
llvm-svn: 267017
|