| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
| |
Haven't thought that I ever needed to do this, but in this case comparing the
index using pointer arithmetic turns out to be really ugly. It also generates
nasty sign-compare warnings on 32 bit targets.
llvm-svn: 214705
|
| |
|
|
| |
llvm-svn: 214704
|
| |
|
|
| |
llvm-svn: 214703
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
patterns of v16i8 shuffles.
This implements one of the more important FIXMEs for the SSE2 support in
the new shuffle lowering. We now generate the optimal shuffle sequence
for truncate-derived shuffles which show up essentially everywhere.
Unfortunately, this exposes a weakness in other parts of the shuffle
logic -- we can no longer form PSHUFB here. I'll add the necessary
support for that and other things in a subsequent commit.
llvm-svn: 214702
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
definition is visible.
Summary:
This allows us to copy the parameter name from the definition (as a comment)
or insert /*unused*/ in both places.
Reviewers: alexfh, klimek
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D4772
llvm-svn: 214701
|
| |
|
|
| |
llvm-svn: 214700
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CXXNameMangler::mangleUnqualifiedBlock believed that
MangleContext::getBlockId returned something that used Itanium-style
discriminator numbers.
Discriminator numbers start their numberign from 1 and the first
mangling that actually gets any sort of number mangled in is the second
discriminator.
However, Block IDs start from zero. The logic for omitting the mangling
number did a ' > 1' instead of a ' > 0' comparison; this could
potentially cause mangling conflicts.
llvm-svn: 214699
|
| |
|
|
|
|
|
| |
A typedef of a typedef should have AlignIsRequired if *either* typedef
has an AlignAttr attached to it.
llvm-svn: 214698
|
| |
|
|
|
|
| |
This commit broke "make check" for several hours, so get it reverted.
llvm-svn: 214697
|
| |
|
|
|
|
|
|
|
|
|
| |
On Cygwin, getpagesize() returns 64k(AllocationGranularity).
In r214580, the size of X86GenInstrInfo.inc became 1499136.
FIXME: We should reorganize again getPageSize() on Win32.
MapFile allocates address along AllocationGranularity but view is mapped by physical page.
llvm-svn: 214681
|
| |
|
|
|
|
|
|
|
|
| |
I spent some time looking into a better or more principled way to handle
this. For example, by detecting arbitrary "unneeded" ORs... But really,
there wasn't any point. We just shouldn't build blatantly wrong code so
late in the pipeline rather than adding more stages and logic later on
to fix it. Avoiding this is just too simple.
llvm-svn: 214680
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Fundamentally, there isn't a really portable way to test the constant
pool contents. Instead, pin this test to the bare-metal triple. This
also makes it a 64-bit triple which allows us to only match a single
constant pool rather than two. It can also just hard code the '.' prefix
as the format should be stable now that it has a fixed triple. Finally,
I've switched it to use CHECK-NEXT to be more precise in the instruction
sequence expected and to use variables rather than hard coding decisions
by the register allocator.
llvm-svn: 214679
|
| |
|
|
|
|
|
| |
poorly-worded warning for a case value that is not a possible value of the
switched-on expression.
llvm-svn: 214678
|
| |
|
|
| |
llvm-svn: 214677
|
| |
|
|
| |
llvm-svn: 214676
|
| |
|
|
|
|
| |
destroyed on shutdown regardless. Fixes a double-delete.
llvm-svn: 214675
|
| |
|
|
| |
llvm-svn: 214674
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
combines) until they are legal.
Doing it the old way could, when the stars align *just* right, cause
a node to get into the combine set prior to being legalized. Then, when
the same node showed up as an operand to another node later on (but not
so much later on that it had been deleted as dead) we would fail to add
it back to the worklist thinking it had already been combined. This
would in turn cause it to not be legalized. Fortunately, we can also
walk the operands looking for uncombined (and thus potentially
un-legalized) nodes late. It will still ensure that we walk all operands
of all nodes and send all of them through both the legalizer without
changes and the combiner at least once. (Which was the original goal of
this).
I have a test case for this bug, but it is terribly brittle. For
example, it will stop finding the bug the moment I enable the new
shuffle lowering. I don't yet have any test case that reliably exercises
this bug, and it isn't clear that it will be possible to craft one. It
is entirely possible that with the new shuffle lowering the two forms of
doing this are precisely equivalent. That doesn't mean we shouldn't take
the more conservative approach of insisting on things in the combined
set having survived the legalizer.
llvm-svn: 214673
|
| |
|
|
|
|
|
|
|
| |
GCC 4.8.2 points out the ambiguity in evaluation of the assertion condition:
lib/Target/X86/X86FloatingPoint.cpp:949:49: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]
assert(STReturns == 0 || isMask_32(STReturns) && N <= 2);
llvm-svn: 214672
|
| |
|
|
|
|
|
|
| |
GCC 4.8.2 objects to the tautological condition in the assert as the unsigned
value is guaranteed to be >= 0. Simplify the assertion by dropping the
tautological condition.
llvm-svn: 214671
|
| |
|
|
|
|
|
|
|
|
| |
This is intended to be the minimal change needed to fix PR20354 ( http://llvm.org/bugs/show_bug.cgi?id=20354 ). The check for a vector operation was wrong; we need to check that the fabs itself is not a vector operation.
This patch will not generate the optimal code. A constant pool load and 'and' op will be generated instead of just returning a value that we can calculate in advance (as we do for the scalar case). I've put a 'TODO' comment for that here and expect to have that patch ready soon.
There is a very similar optimization that we can do in visitFNEG, so I've put another 'TODO' there and expect to have another patch for that too.
llvm-svn: 214670
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
sequence - AArch64 target support
This patch turns off madd/msub generation in the DAGCombiner and generates
them in the MachineCombiner instead. It replaces the original code sequence
with the combined sequence when it is beneficial to do so.
When there is no machine model support it always generates the madd/msub
instruction. This is true also when the objective is to optimize for code
size: when the combined sequence is shorter is always chosen and does not
get evaluated.
When there is a machine model the combined instruction sequence
is evaluated for critical path and resource length using machine
trace metrics and the original code sequence is replaced when it is
determined to be faster.
rdar://16319955
llvm-svn: 214669
|
| |
|
|
|
|
|
| |
It's a bit more obvious what's going on if we use path::filename
rather than decrementing an iterator here.
llvm-svn: 214668
|
| |
|
|
|
|
|
|
| |
call Target::SetArchitecture instead of modifying a
reference to the target's architecture so that the
target logging can show that the arch has been changed.
llvm-svn: 214667
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
sequence - target independent framework
When the DAGcombiner selects instruction sequences
it could increase the critical path or resource len.
For example, on arm64 there are multiply-accumulate instructions (madd,
msub). If e.g. the equivalent multiply-add sequence is not on the
crictial path it makes sense to select it instead of the combined,
single accumulate instruction (madd/msub). The reason is that the
conversion from add+mul to the madd could lengthen the critical path
by the latency of the multiply.
But the DAGCombiner would always combine and select the madd/msub
instruction.
This patch uses machine trace metrics to estimate critical path length
and resource length of an original instruction sequence vs a combined
instruction sequence and picks the faster code based on its estimates.
This patch only commits the target independent framework that evaluates
and selects code sequences. The machine instruction combiner is turned
off for all targets and expected to evolve over time by gradually
handling DAGCombiner pattern in the target specific code.
This framework lays the groundwork for fixing
rdar://16319955
llvm-svn: 214666
|
| |
|
|
|
|
|
|
|
|
| |
There is no needed for neither 1-dimensional nor higher dimensional arrays to
require positive offsets in the outermost array dimension.
We originally introduced this assumption with the support for delinearizing
multi-dimensional arrays.
llvm-svn: 214665
|
| |
|
|
|
|
|
|
|
| |
This makes EmitWindowsUnwindTables a virtual function and lowers the
implementation of the function to the X86WinCOFFStreamer. This method is a
target specific operation. This enables making the behaviour target dependent
by isolating it entirely to the target specific streamer.
llvm-svn: 214664
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The frame information stored in this structure is driven by the requirements for
Windows NT unwinding rather than Windows 64 specifically. As a result, this
type can be shared across multiple architectures (ARM, AXP, MIPS, PPC, SH).
Rename this class in preparation for adding support for supporting unwinding
information for Windows on ARM.
Take the opportunity to constify the members as everything except the
ChainedParent is read-only. This required some adjustment to the label
handling.
llvm-svn: 214663
|
| |
|
|
|
|
| |
type handling.
llvm-svn: 214662
|
| |
|
|
|
|
|
|
|
| |
This slipped in in r214467, so something like
V_MOV_B32_e32 v0, ... is now printed with 2 spaces
between the instruction name and first operand.
llvm-svn: 214660
|
| |
|
|
|
|
|
|
|
|
|
|
| |
+ Remove the class IslGenerator which duplicates the functionality of
IslExprBuilder.
+ Use the IslExprBuilder to create code for memory access relations.
+ Also handle array types during access creation.
+ Enable scev codegen for one of the transformed memory access tests,
thus access creation without canonical induction variables available.
+ Update one test case to the new output.
llvm-svn: 214659
|
| |
|
|
| |
llvm-svn: 214658
|
| |
|
|
|
|
|
|
| |
The updated tests use a different context than the old ones did.
Other than that only their path and the code generation we use
changed.
llvm-svn: 214657
|
| |
|
|
|
|
| |
mingw32 builder.
llvm-svn: 214656
|
| |
|
|
|
|
|
|
|
| |
When we have a covered lookup table, make sure we don't delete PHINodes that
are cached in PHIs.
rdar://17887153
llvm-svn: 214642
|
| |
|
|
|
|
| |
target independent.
llvm-svn: 214641
|
| |
|
|
| |
llvm-svn: 214640
|
| |
|
|
| |
llvm-svn: 214639
|
| |
|
|
| |
llvm-svn: 214638
|
| |
|
|
| |
llvm-svn: 214637
|
| |
|
|
|
|
|
|
|
| |
when let can do the same thing. Keep the 64bit variants as codegen-only.
While they have a different register class, the encoding is the same for
32bit and 64bit mode. Having both present would otherwise confuse the
disassembler.
llvm-svn: 214636
|
| |
|
|
| |
llvm-svn: 214635
|
| |
|
|
|
|
|
|
| |
vector load is not a good transform when paired loads are available.
The combiner was creating Q-register loads and stores, which then had to be spilled because there are no callee-save Q registers!
llvm-svn: 214634
|
| |
|
|
|
|
|
| |
This area of code is currently not very much tested. It will hopefully be
superseeded by Yabin's GSoC project.
llvm-svn: 214633
|
| |
|
|
| |
llvm-svn: 214632
|
| |
|
|
|
|
| |
forget to update the comment here... =/
llvm-svn: 214630
|
| |
|
|
|
|
|
| |
Darwin x86 asm comment prefix designed to work around GAS on that
platform. That makes the comment-matching of the test much more stable.
llvm-svn: 214629
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
lowering with a small addition to it and adding PSHUFB combining.
There is one obvious place in the new vector shuffle lowering where we
should form PSHUFBs directly: when without them we will unpack a vector
of i8s across two different registers and do a potentially 4-way blend
as i16s only to re-pack them into i8s afterward. This is the crazy
expensive fallback path for i8 shuffles and we can just directly use
pshufb here as it will always be cheaper (the unpack and pack are
two instructions so even a single shuffle between them hits our
three instruction limit for forming PSHUFB).
However, this doesn't generate very good code in many cases, and it
leaves a bunch of common patterns not using PSHUFB. So this patch also
adds support for extracting a shuffle mask from PSHUFB in the X86
lowering code, and uses it to handle PSHUFBs in the recursive shuffle
combining. This allows us to combine through them, combine multiple ones
together, and generally produce sufficiently high quality code.
Extracting the PSHUFB mask is annoyingly complex because it could be
either pre-legalization or post-legalization. At least this doesn't have
to deal with re-materialized constants. =] I've added decode routines to
handle the different patterns that show up at this level and we dispatch
through them as appropriate.
The two primary test cases are updated. For the v16 test case there is
still a lot of room for improvement. Since I was going through it
systematically I left behind a bunch of FIXME lines that I'm hoping to
turn into ALL lines by the end of this.
llvm-svn: 214628
|
| |
|
|
|
|
|
| |
Spotted this missed refactoring by inspection when reading code, and it
doesn't changethe functionality at all.
llvm-svn: 214627
|
| |
|
|
| |
llvm-svn: 214626
|