| Commit message | Author | Age | Files | Lines |
| |
This is a re-commit of r190764, with an extra check to make sure that we're not
performing the transformation on illegal types (a small test case has been
added for this as well).
Original commit message:
The PPC backend uses a target-specific DAG combine to turn unaligned Altivec
loads into a permutation-based sequence when possible. Unfortunately, the
target-specific DAG combine is not always called on all loads of interest
(sometimes the routines in DAGCombine call CombineTo such that the new node and
users are not added to the worklist); allowing the combine to trigger early
(before type legalization) mitigates this problem. Because the autovectorizers
only create legal vector types, I don't expect a lot of cases where this
optimization is enabled by type legalization in practice.
llvm-svn: 190771
| |
llvm-svn: 190770
| |
llvm-svn: 190769
| |
Summary:
The '?' flag uses the last section group if the last had a section
group. We treat combining an explicit section group and the '?' as a
hard error.
This fixes PR17198.
Reviewers: rafael, bkramer
Reviewed By: bkramer
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1686
llvm-svn: 190768
| |
For alignment purposes, the instruction array will always have an even
number of entries, with the final entry potentially unused (in which
case the array will be one longer than indicated by the count of unwind
codes field).
Reviewed by Anton Korobeynikov, Charles Davis and Nico Rieck.
llvm-svn: 190767
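The even-entry rule for the instruction array described above can be sketched as a rounding step. This is a minimal illustration, not code from the patch: the helper name `paddedUnwindCodeSlots` is hypothetical and only models the arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): number of slots the unwind code
// array occupies for a given "count of unwind codes" field. The array always
// has an even number of entries, so an odd count gains one unused slot.
constexpr std::uint32_t paddedUnwindCodeSlots(std::uint32_t countOfCodes) {
  return (countOfCodes + 1u) & ~1u;
}

// Small self-check covering the even, odd, and empty cases.
inline void checkPaddingExamples() {
  assert(paddedUnwindCodeSlots(0) == 0); // empty array needs no padding
  assert(paddedUnwindCodeSlots(3) == 4); // odd count: one unused trailing slot
  assert(paddedUnwindCodeSlots(4) == 4); // even count: unchanged
}
```

A reader of the table would therefore step over `paddedUnwindCodeSlots(count)` entries rather than `count` entries to reach whatever follows the array.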
| |
data structures.
The Win64 EH data structures must be of type IMAGE_REL_AMD64_ADDR32NB
instead of IMAGE_REL_AMD64_ADDR32. This is easily achieved by adding
the VK_COFF_IMGREL32 modifier to the symbol reference.
Also change references to the start and end of a function's SEH range
to be offsets from the start of the function.
Reviewed by Jim Grosbach, Charles Davis and Nico Rieck.
llvm-svn: 190766
| |
This is causing test-suite failures.
Original commit message:
The PPC backend uses a target-specific DAG combine to turn unaligned Altivec
loads into a permutation-based sequence when possible. Unfortunately, the
target-specific DAG combine is not always called on all loads of interest
(sometimes the routines in DAGCombine call CombineTo such that the new node and
users are not added to the worklist); allowing the combine to trigger early
(before type legalization) mitigates this problem. Because the autovectorizers
only create legal vector types, I don't expect a lot of cases where this
optimization is enabled by type legalization in practice.
llvm-svn: 190765
| |
The PPC backend uses a target-specific DAG combine to turn unaligned Altivec
loads into a permutation-based sequence when possible. Unfortunately, the
target-specific DAG combine is not always called on all loads of interest
(sometimes the routines in DAGCombine call CombineTo such that the new node and
users are not added to the worklist); allowing the combine to trigger early
(before type legalization) mitigates this problem. Because the autovectorizers
only create legal vector types, I don't expect a lot of cases where this
optimization is enabled by type legalization in practice.
llvm-svn: 190764
| |
DAGCombiner::isAlias can be called with SrcValue1 or SrcValue2 null, and we
can't use AA in this case (if we try, then the casting code in AA will assert).
llvm-svn: 190763
| |
so it can be better used for general interoperability testing between mips32
and mips16.
llvm-svn: 190762
| |
llvm-svn: 190759
| |
Also assembly/disassembly tests, and for sha256rnds2, aliases with an explicit
xmm0 dependency.
llvm-svn: 190754
| |
llvm-svn: 190750
| |
llvm-svn: 190749
| |
This pass was based on the previous (essentially unused) profiling
infrastructure and the assumption that by ordering the basic blocks at
the IR level in a particular way, the correct layout would happen in the
end. This sometimes worked, and mostly didn't. It also was a really
naive implementation of the classical paper that dates from when branch
predictors were primarily directional and when loop structure wasn't
commonly available. It also didn't factor into the equation
non-fallthrough branches and other machine level details.
Anyways, for all of these reasons and more, I wrote
MachineBlockPlacement, which completely supersedes this pass. It both
uses modern profile information infrastructure, and actually works. =]
llvm-svn: 190748
| |
llvm-svn: 190746
| |
llvm-svn: 190745
| |
llvm-svn: 190744
| |
This was somewhat tricky because ~PrettyStackTraceEntry() may run after
llvm_shutdown() has been called. This is rare and only happens for a common idiom
used in the main() functions of command-line tools. This works around the idiom by
skipping the stack clean-up if the PrettyStackTraceHead ManagedStatic is not
constructed (i.e. llvm_shutdown() has been called).
llvm-svn: 190730
| |
As it turns out, not a problem in practice, but it should be there.
llvm-svn: 190720
| |
Implements instruction scheduler latencies for Silvermont,
using latencies from the Intel Silvermont Optimization Guide.
Auto-detects SLM.
Turns on the post-RA scheduler when generating code for SLM.
llvm-svn: 190717
| |
By definition copies across register banks are not coalescable. Still, it may be
possible to get rid of such a copy when the value is available in another
register of the same register file.
Consider the following example, where capital and lowercase letters denote
different register files:
b = copy A <-- cross-bank copy
...
C = copy b <-- cross-bank copy
This could have been optimized this way:
b = copy A <-- cross-bank copy
...
C = copy A <-- same-bank copy
Note: b and C's definitions may be in different basic blocks.
This patch adds a peephole optimization that looks through a chain of copies
leading to a cross-bank copy and reuses a source that is on the same register
file if available.
This solution could also be used to get rid of some copies (e.g., A could have
been used instead of C). However, we do not do so because:
- It may over-constrain the coloring of the source register for coalescing.
- The register allocator may not be able to find a nice split point for the
longer live-range, leading to more spills.
<rdar://problem/14742333>
llvm-svn: 190713
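The look-through step described above can be sketched on a toy model. This is only an illustration under stated assumptions: `ToyReg`, `ToyFunction`, and `findSameBankSource` are hypothetical names, not the data structures or functions used by the actual peephole pass, and banks are modelled as single characters.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Toy model (not LLVM's data structures): each register has a register bank
// and, if it is defined by a copy, the name of that copy's source register.
struct ToyReg {
  char bank;           // e.g. 'U' for one register file, 'L' for another
  std::string copySrc; // empty when the register is not defined by a copy
};

using ToyFunction = std::map<std::string, ToyReg>;

// Given the source of a cross-bank copy whose destination lives in dstBank,
// walk the chain of copies feeding that source and return the first register
// found in dstBank. Returns the original source when no such register exists.
inline std::string findSameBankSource(const ToyFunction &fn,
                                      const std::string &src, char dstBank) {
  std::string cur = src;
  // Bound the walk by the number of registers so a malformed cycle terminates.
  for (std::size_t steps = 0; steps < fn.size(); ++steps) {
    auto it = fn.find(cur);
    if (it == fn.end())
      break;
    if (it->second.bank == dstBank)
      return cur; // same-bank value found: the copy can read it directly
    if (it->second.copySrc.empty())
      break; // chain ends without reaching the destination bank
    cur = it->second.copySrc;
  }
  return src;
}
```

With `A` in bank 'U' and `b = copy A` in bank 'L', rewriting `C = copy b` (destination bank 'U') would ask for `findSameBankSource(fn, "b", 'U')` and get `A`, matching the example in the commit message.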
| |
to be more consistent.
llvm-svn: 190692
| |
Compiler part.
llvm-svn: 190689
| |
Patch by Bradley Smith!
llvm-svn: 190683
| |
llvm-svn: 190676
| |
Just a clean-up, no behavioral change intended.
llvm-svn: 190673
| |
E.g. "SRL %r2, 2; TMLL %r2, 1" => "TMLL %r2, 4".
llvm-svn: 190672
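The example above relies on a simple bit identity: testing a mask against a right-shifted value is the same as testing the left-shifted mask against the original value, as long as shifting the mask loses no bits. The sketch below models only that arithmetic. The function names are hypothetical, and the conditions the real combine checks are not shown in the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Test a mask against a value that was first shifted right
// (the "SRL %r2, 2; TMLL %r2, 1" pair).
constexpr bool testAfterShift(std::uint32_t value, unsigned shift,
                              std::uint32_t mask) {
  return ((value >> shift) & mask) != 0;
}

// Folded form: move the shift into the mask and test the original value
// (the single "TMLL %r2, 4"). Valid only when (mask << shift) loses no bits.
constexpr bool testFolded(std::uint32_t value, unsigned shift,
                          std::uint32_t mask) {
  return (value & (mask << shift)) != 0;
}

// Exhaustively compare both forms for the commit's example (shift 2, mask 1)
// over all 16-bit inputs.
inline bool formsAgreeOnExample() {
  for (std::uint32_t v = 0; v <= 0xFFFFu; ++v)
    if (testAfterShift(v, 2, 1) != testFolded(v, 2, 1))
      return false;
  return true;
}
```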
| |
disabled.
llvm-svn: 190668
| |
Previously we modelled VPR128 and VPR64 as essentially identical
register-classes containing V0-V31 (which had Q0-Q31 as "sub_alias"
sub-registers). This model is starting to cause significant problems
for code generation, particularly writing EXTRACT/INSERT_SUBREG
patterns for converting between the two.
The change here switches to classifying VPR64 & VPR128 as
RegisterOperands, which are essentially aliases for RegisterClasses
with different parsing and printing behaviour. This fits almost
exactly with their real status (VPR128 == FPR128 printed strangely,
VPR64 == FPR64 printed strangely).
llvm-svn: 190665
| |
llvm-svn: 190659
| |
versions of gold. This support is designed to allow gold to produce
gdb_index sections similar to the accelerator tables and consumable
by gdb.
llvm-svn: 190649
| |
llvm-svn: 190648
| |
llvm-svn: 190645
| |
llvm-svn: 190644
| |
This move makes it possible to correctly handle multiple instructions
from a single pattern.
llvm-svn: 190643
| |
llvm-svn: 190640
| |
When a structure is passed by value, and that structure contains a vector
member, according to the PPC ABI, the structure will receive enhanced alignment
(so that the vector within the structure will always be aligned).
This should resolve PR16641.
llvm-svn: 190636
| |
Reviewed by Joe Abbey and Tobias Grosser
Here is a patch that fixes decoding of CE_SELECT in BitcodeReader,
along with a simple test case. The problem in the current code is that
it generates but doesn't accept bitcode that uses vectors for the
first element of a select in this context.
llvm-svn: 190634
| |
a volatile load, or a volatile store.
llvm-svn: 190631
| |
In fast-math mode, sqrt(x) is calculated as the reciprocal of the reciprocal
sqrt, using the fast expansion for each. The reciprocal and reciprocal
sqrt expansions use the associated estimate instructions along with some Newton
iterations. Unfortunately, as a result, sqrt(0) was being calculated as NaN,
which is not correct. Now we explicitly return a result of zero if the input is
zero.
llvm-svn: 190624
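The failure mode described above can be reproduced in plain C++ arithmetic. This is a sketch under assumptions, not the backend's expansion: `rsqrtEstimate` stands in for the hardware estimate instruction and is assumed to yield +infinity for a zero input, a single Newton step is used, and all function names are hypothetical. It also relies on IEEE-754 semantics, so it should not itself be compiled with fast-math flags.

```cpp
#include <cassert>
#include <cmath>

// Stand-in for a hardware reciprocal-sqrt estimate (assumption: like the
// estimate instruction, it yields +infinity for an input of zero).
inline float rsqrtEstimate(float x) { return 1.0f / std::sqrt(x); }

// One Newton-Raphson refinement step for y ~= 1/sqrt(x).
inline float refineRsqrt(float x, float y) {
  return y * (1.5f - 0.5f * x * y * y);
}

// Unguarded expansion: sqrt(x) computed as 1 / rsqrt(x).
// For x == 0 the refinement multiplies 0 by infinity, producing NaN.
inline float fastSqrtUnguarded(float x) {
  float y = refineRsqrt(x, rsqrtEstimate(x));
  return 1.0f / y;
}

// Guarded expansion: explicitly return zero when the input is zero.
inline float fastSqrtGuarded(float x) {
  if (x == 0.0f)
    return 0.0f;
  return fastSqrtUnguarded(x);
}
```

For a zero input the term `0.5f * x * y * y` evaluates 0 times infinity, so the NaN appears inside the Newton step and propagates through the final reciprocal. The explicit zero check sidesteps the expansion entirely for that input.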
| |
FreeBSD kernel.
llvm-svn: 190618
| |
Mutex and global ThreadLocals, thereby getting rid of the load-time
initialization of those objects and also getting rid of their destruction
unless the LLVM client calls llvm_shutdown.
llvm-svn: 190617
| |
Add basic assembly/disassembly support for the first Intel SHA
instruction 'sha1rnds4'. Also includes feature flag, and test cases.
Support for the remaining instructions will follow in a separate patch.
llvm-svn: 190611
| |
Use the new instruction deprecation feature to mark mftb (now replaced with
mfspr) and dst (along with the other Altivec cache control instructions) as
deprecated when targeting cores supporting at least ISA v2.03.
llvm-svn: 190605
| |
undef constant for structure and test for these functions.
done by Yuri Veselov (mailto:Yuri.Veselov@intel.com)
llvm-svn: 190599
| |
The 'Deprecated' class allows you to specify a SubtargetFeature that the
instruction is deprecated on.
The 'ComplexDeprecationPredicate' class allows you to define a custom
predicate that is called to check for deprecation.
For example:
ComplexDeprecationPredicate<"MCR">
would mean you would have to define the following function:
bool getMCRDeprecationInfo(MCInst &MI, MCSubtargetInfo &STI,
std::string &Info)
which returns 'false' if the instruction is not deprecated, and 'true' if it
is deprecated, storing the warning message in 'Info'.
The MCTargetAsmParser constructor was changed to take an extra argument of
the MCInstrInfo class, so out-of-tree targets will need to be changed.
llvm-svn: 190598
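The return-value contract of such a predicate can be sketched as follows. This is a stand-alone illustration only: `ToyInst` and `ToySubtargetInfo` are hypothetical stand-ins for MCInst and MCSubtargetInfo, and the deprecation condition and warning text are invented for the example rather than taken from any target.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for MCInst and MCSubtargetInfo so the sketch
// compiles on its own; the real classes are not modelled here.
struct ToyInst {
  unsigned opcode;
};
struct ToySubtargetInfo {
  bool hasNewerISA; // invented feature bit used only for this example
};

// Shape of a custom deprecation predicate: return false when the instruction
// is not deprecated; return true and fill in Info when it is.
inline bool getMCRDeprecationInfo(ToyInst &MI, ToySubtargetInfo &STI,
                                  std::string &Info) {
  (void)MI; // a real predicate would usually inspect the operands
  if (!STI.hasNewerISA)
    return false; // Info is left untouched when nothing is deprecated
  Info = "instruction is deprecated on this subtarget";
  return true;
}
```

A caller would only read `Info` after the predicate returns 'true', since the sketch leaves it unmodified otherwise.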
| |
Added parsing of mask register and "zeroing" semantic, like {%k1} {z}.
llvm-svn: 190595
| |
Aggressive anti-dependency breaking is enabled by default for all PPC cores.
This provides a general speedup on the P7 and other platforms (among other
factors, the instruction group formation for the non-embedded PPC cores is done
during post-RA scheduling). In order to do this safely, the incompatibility
between uses of the MFOCRF instruction and anti-dependency breaking is
resolved by marking MFOCRF with hasExtraSrcRegAllocReq. As noted in the removed
FIXME, the problem was that MFOCRF's output is sensitive to the identity of the
source register, and is always paired with a shift to undo this effect. Because
anti-dependency breaking is unaware of this hidden dependency of the shift
amount on the source register of the MFOCRF instruction, changing that register
must be inhibited.
Two test cases were adjusted: The SjLj test was made more insensitive to
register choices and scheduling; the saveCR test disabled anti-dependency
breaking because part of what it is testing is proper register reuse.
llvm-svn: 190587
| |
If no register classes are added to CriticalPathRCs, then the CriticalPathSet
bitmask will be empty. In that case, ExcludeRegs must remain NULL or else this
line will cause a segfault:
} else if ((ExcludeRegs != NULL) && ExcludeRegs->test(AntiDepReg)) {
I have no in-tree test case.
llvm-svn: 190584
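The guard described above can be sketched with a toy bit vector. This is an assumption-laden illustration, not the patch itself: `ToyBitVector` is a `std::vector<bool>` standing in for the real bitmask type, an "empty" bitmask is modelled as a zero-length vector, and both helper names are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for the register bitmask (not LLVM's BitVector).
using ToyBitVector = std::vector<bool>;

// Only hand out a pointer to the critical-path set when it has entries;
// otherwise the exclusion pointer stays null.
inline const ToyBitVector *chooseExcludeRegs(const ToyBitVector &criticalPathSet) {
  return criticalPathSet.empty() ? nullptr : &criticalPathSet;
}

// Mirrors the quoted condition: the null check must come first so an empty
// set is never indexed.
inline bool isExcluded(const ToyBitVector *excludeRegs, unsigned antiDepReg) {
  return excludeRegs != nullptr && (*excludeRegs)[antiDepReg];
}
```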