| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
| |
llvm-svn: 238501
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For x86 targets, do not do sibling call optimization when materializing
the callee's address would require a GOT relocation. We can still do
tail calls to internal functions, hidden functions, and protected
functions, because they do not require this kind of relocation. It is
still possible to get GOT relocations when the user explicitly asks for
it with musttail or -tailcallopt, both of which are supposed to
guarantee TCO.
Based on a patch by Chih-hung Hsieh.
Reviewers: srhines, timmurray, danalbert, enh, void, nadav, rnk
Subscribers: joerg, davidxl, llvm-commits
Differential Revision: http://reviews.llvm.org/D9799
llvm-svn: 238487
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were previously codegen'ing these as regular load/store operations and
hoping that the register allocator would allocate registers in ascending order
so that we could apply an LDM/STM combine after register allocation. According
to the commit that first introduced this code (r37179), we planned to teach
the register allocator to allocate the registers in ascending order. This
never got implemented, and up to now we've been stuck with very poor codegen.
A much simpler approach for achiveing better codegen is to create LDM/STM
instructions with identical sets of virtual registers, let the register
allocator pick arbitrary registers and order register lists when printing an
MCInst. This approach also avoids the need to repeatedly calculate offsets
which ultimately ought to be eliminated pre-RA in order to decrease register
pressure.
This is implemented by lowering the memcpy intrinsic to a series of SD-only
MCOPY pseudo-instructions which performs a memory copy using a given number
of registers. During SD->MI lowering, we lower MCOPY to LDM/STM. This is a
little unusual, but it avoids the need to encode register lists in the SD,
and we can take advantage of SD use lists to decide whether to use the _UPD
variant of the instructions.
Fixes PR9199.
Differential Revision: http://reviews.llvm.org/D9508
llvm-svn: 238473
|
|
|
|
| |
llvm-svn: 238472
|
|
|
|
| |
llvm-svn: 238448
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Octeon CPUs use dmtc2 rt,imm16 and dmfcp2 rt,imm16 for the crypto coprocessor.
E.g. dmtc2 rt,0x4057 starts calculation of sha-1.
I had to introduce a new deconding namespace to avoid a decoding conflict.
Reviewed By: dsanders
Differential Revision: http://reviews.llvm.org/D10083
llvm-svn: 238439
|
|
|
|
|
|
|
|
|
|
| |
Add support for resolving MIPS64r2 and MIPS64r6 relocations in MCJIT.
Patch by Vladimir Radosavljevic.
Differential Revision: http://reviews.llvm.org/D9667
llvm-svn: 238424
|
|
|
|
|
|
| |
Creating temporary std::strings there is unnecessary.
llvm-svn: 238412
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch made two improvements to NaryReassociate and the NVPTX pipeline
1. Run EarlyCSE/GVN after NaryReassociate to get rid of redundant common
expressions.
2. When adding an instruction to SeenExprs, maps both the SCEV before and after
reassociation to that instruction.
Test Plan: updated @reassociate_gep_nsw in nary-gep.ll
Reviewers: meheff, broune
Reviewed By: broune
Subscribers: dberlin, jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D9947
llvm-svn: 238396
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that most of the methods in Clang and LLVM that were parsing arch/cpu/fpu
strings are using ARMTargetParser, it's time to make it a bit more conforming
with what the ABI says.
This commit adds some clarification on what build attributes are accepted and
which are "non-standard". It also makes clear that the "defaultCPU" and
"defaultArch" methods were really just build attribute getters.
It also diverges from GCC's behaviour to say that armv2/armv3 are really an
ARMv4 in the build attributes, when the ABI has a clear state for that: Pre-v4.
llvm-svn: 238344
|
|
|
|
|
| |
reviewer: tstellardAMD
llvm-svn: 238337
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D9739
llvm-svn: 238333
|
|
|
|
|
|
|
|
|
|
|
| |
and BNEZALC instructions
This patch implements microMIPS32r6 BEQZALC, BGEZALC, BGTZALC, BLEZALC, BLTZALC
and BNEZALC instructions using mapping.
Differential Revision: http://reviews.llvm.org/D10031
llvm-svn: 238325
|
|
|
|
|
|
| |
By Igor Breger (igor.breger@intel.com)
llvm-svn: 238322
|
|
|
|
| |
llvm-svn: 238315
|
|
|
|
|
|
|
|
|
|
| |
.eh_frame to be read-only.
This broke the llvm-mips-linux builder and several of our out-of-tree builders.
Initial investigations show that the commit probably isn't the problem but
reverting anyway while I investigate.
llvm-svn: 238302
|
|
|
|
|
|
|
|
|
|
|
| |
for KNL and SKX
Implemented DAG lowering for all these forms.
Added tests for DAG lowering and encoding.
By Igor Breger (igor.breger@intel.com)
llvm-svn: 238301
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With this patch the x86 backend is now shrink-wrapping capable
and this functionality can be tested by using the
-enable-shrink-wrap switch.
The next step is to make more test and enable shrink-wrapping by
default for x86.
Related to <rdar://problem/20821487>
llvm-svn: 238293
|
|
|
|
| |
llvm-svn: 238289
|
|
|
|
|
|
|
|
|
|
|
| |
This gets gas and llc -filetype=obj to agree on the order of prefixes.
For llvm-mc we need to fix the asm parser to know that it makes a difference
on which line the "lock" is in.
Part of pr23594.
llvm-svn: 238232
|
|
|
|
|
|
| |
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com>
llvm-svn: 238229
|
|
|
|
|
|
|
|
| |
v2: Use C++ comments and end with periods
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com>
llvm-svn: 238228
|
|
|
|
|
|
|
| |
This reverts commit r238201 to fix linking problems in x86 Linux
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150525/278413.html
llvm-svn: 238223
|
|
|
|
| |
llvm-svn: 238211
|
|
|
|
|
|
| |
There is now no SectionData to be created.
llvm-svn: 238208
|
|
|
|
| |
llvm-svn: 238201
|
|
|
|
| |
llvm-svn: 238199
|
|
|
|
|
|
| |
https://llvm.org/bugs/show_bug.cgi?id=23630
llvm-svn: 238198
|
|
|
|
|
|
| |
https://llvm.org/bugs/show_bug.cgi?id=23634
llvm-svn: 238195
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, subtarget features were a bitfield with the underlying type being uint64_t.
Since several targets (X86 and ARM, in particular) have hit or were very close to hitting this bound, switching the features to use a bitset.
No functional change.
The first several times this was committed (e.g. r229831, r233055), it caused several buildbot failures.
Apparently the reason for most failures was both clang and gcc's inability to deal with large numbers (> 10K) of bitset constructor calls in tablegen-generated initializers of instruction info tables.
This should now be fixed.
llvm-svn: 238192
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Following on from r209907 which made personality encodings indirect, do the
same for TType encodings. This fixes the case where a try/catch block needs
to generate references to, for example, std::exception in the
.gcc_except_table.
This commit uses DW_EH_PE_sdata8 for N64 as far as is possible at the moment.
However, it is possible to end up with DW_EH_PE_sdata4 when a TargetMachine is
not available. There's no risk of issues with inconsistency here since the
tables are self describing but it does mean there is a small chance of the
PC-relative offset being out of range for particularly large programs.
Reviewers: petarj
Reviewed By: petarj
Subscribers: srhines, joerg, tberghammer, llvm-commits
Differential Revision: http://reviews.llvm.org/D9669
llvm-svn: 238190
|
|
|
|
| |
llvm-svn: 238172
|
|
|
|
| |
llvm-svn: 238165
|
|
|
|
| |
llvm-svn: 238163
|
|
|
|
|
|
| |
Another step in merging MCSectionData and MCSection.
llvm-svn: 238162
|
|
|
|
|
|
|
| |
This also changes MCAssembler to store a vector of MCSections instead of an
iplist of MCSectionData.
llvm-svn: 238159
|
|
|
|
|
|
|
|
| |
Part of D9474, this patch extends AVX2 v16i16 types to 2 x 8i32 vectors and uses i32 shift variable shifts before packing back to i16.
Adds AVX2 tests for v8i16 and v16i16
llvm-svn: 238149
|
|
|
|
|
|
|
| |
DisableEncoding and Constraints can be set using let statements around
the multiclass defs.
llvm-svn: 238148
|
|
|
|
|
|
| |
The src and dst register cannot be the same on chips with 16 lds banks.
llvm-svn: 238147
|
|
|
|
|
|
|
|
| |
This lets us drop a parameter the opName parameter to the VINTRP
multiclass and makes it possible to create multiple VINTRP defs
with the same asm mnemonic.
llvm-svn: 238146
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
in POWER8:
vadduqm
vaddeuqm
vaddcuq
vaddecuq
vsubuqm
vsubeuqm
vsubcuq
vsubecuq
In addition to adding the instructions themselves, it also adds support for the
v1i128 type for intrinsics (Intrinsics.td, Function.cpp, and
IntrinsicEmitter.cpp).
http://reviews.llvm.org/D9081
llvm-svn: 238144
|
|
|
|
| |
llvm-svn: 238139
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
first and second operands.
The semantics of the scalar FMA intrinsics are that the high vector elements are copied from the first source.
The existing pattern switches src1 and src2 around, to match the "213" order, which ends up tying the original src2 to the dest. Since the actual scalar fma3 instructions copy the high elements from the dest register, the wrong values are copied.
This modifies the pattern to leave src1 and src2 in their original order.
Differential Revision: http://reviews.llvm.org/D9908
llvm-svn: 238131
|
|
|
|
|
|
|
| |
I encountered with this case in one of KNL tests for i1 vectors.
v16i1 = EXTRACT_SUBVECTOR v32i1, x
llvm-svn: 238130
|
|
|
|
| |
llvm-svn: 238126
|
|
|
|
| |
llvm-svn: 238125
|
|
|
|
|
|
|
|
|
|
| |
On GPU targets, materializing constants is cheap and stores are
expensive, so only doing this for zero vectors was silly.
Most of the new testcases aren't optimally merged, and are for
later improvements.
llvm-svn: 238108
|
|
|
|
|
|
|
|
| |
allocation.
NFC.
llvm-svn: 238104
|
|
|
|
|
|
| |
NFC.
llvm-svn: 238103
|
|
|
|
| |
llvm-svn: 238102
|