| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
dynamic blends.
This makes it much more clear what is going on. The case we're handling
is that of dynamic conditions, and we're bailing when the nature of the
vector types and subtarget preclude lowering the dynamic condition
vselect as an actual blend.
No functionality changed here, but this will make a subsequent bug-fix
to this code much more clear.
llvm-svn: 230690
|
| |
|
|
|
|
|
| |
change functionality, but makes it more clear that the dynamic case and
the shuffle case don't overlap in any interesting way.
llvm-svn: 230689
|
| |
|
|
|
|
| |
AVX2.
llvm-svn: 230688
|
| |
|
|
| |
llvm-svn: 230685
|
| |
|
|
|
|
| |
clang-cl -Wtautological
llvm-svn: 230684
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Creating BinaryCoverageReader is a strange and complicated dance where
the constructor sets error codes that member functions will later
read, and the object is in an invalid state if readHeader isn't
immediately called after construction.
Instead, make the constructor private and add a static create method
to do the construction properly. This also has the benefit of removing
readHeader completely and simplifying the interface of the object.
llvm-svn: 230676
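The private-constructor-plus-static-`create` shape described in this message can be sketched as follows. This is a minimal illustration, not the real BinaryCoverageReader: the class name `Reader`, the `"COVM"` magic string, and the `std::string` error channel are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for illustration; not the real BinaryCoverageReader.
class Reader {
public:
  // The only way to obtain a Reader: validate first, construct on success.
  // Returns nullptr and fills Err when the (made-up) header is invalid.
  static std::unique_ptr<Reader> create(const std::string &Buffer,
                                        std::string &Err) {
    if (Buffer.size() < MagicLen || Buffer.compare(0, MagicLen, "COVM") != 0) {
      Err = "bad magic";
      return nullptr;
    }
    // make_unique cannot reach the private constructor, so use new directly.
    return std::unique_ptr<Reader>(new Reader(Buffer));
  }

  // Everything after the header; always valid because create() checked it.
  std::string payload() const { return Buffer.substr(MagicLen); }

private:
  static constexpr std::size_t MagicLen = 4; // assumed header length

  // Private: callers can never observe a half-initialized object.
  explicit Reader(std::string Buf) : Buffer(std::move(Buf)) {
    assert(Buffer.size() >= MagicLen && "create() must validate first");
  }

  std::string Buffer;
};
```

With this shape there is no separate header-reading step for callers to forget: a non-null result from `Reader::create` is already fully usable.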
|
| |
|
|
|
|
|
| |
The current name is long and confusing. A shorter one is both easier
to understand and easier to work with.
llvm-svn: 230675
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It is not sound to mark the increment operation as `nuw` or `nsw`
based on a proof about the add recurrence if the increment operation
we emit happens to be a `sub` instruction.
I could not come up with a test case for this -- the set of cases where
SCEVExpander decides to emit a `sub` instruction is quite small, and I
cannot think of a way to get SCEV to prove that the increment does not
overflow in those cases.
Differential Revision: http://reviews.llvm.org/D7899
llvm-svn: 230673
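Why a no-wrap fact about an addition does not carry over to the equivalent subtraction can be seen with 8-bit arithmetic. The sketch below assumes, purely for illustration, a step of -1 (255 as an i8) that could be emitted either as `add x, 255` or as `sub x, 1`; it is not meant to reproduce the exact situation inside SCEVExpander, and all function names are made up.

```cpp
#include <cassert>
#include <cstdint>

// True if the 8-bit unsigned addition A + B does not wrap (`add nuw`).
inline bool addNoUnsignedWrap(uint8_t A, uint8_t B) {
  return static_cast<unsigned>(A) + static_cast<unsigned>(B) <= 0xFFu;
}

// True if the 8-bit unsigned subtraction A - B does not wrap (`sub nuw`).
inline bool subNoUnsignedWrap(uint8_t A, uint8_t B) { return A >= B; }

// The two encodings of "decrement by one" always produce the same bits.
inline uint8_t viaAdd(uint8_t X) { return static_cast<uint8_t>(X + 255u); }
inline uint8_t viaSub(uint8_t X) { return static_cast<uint8_t>(X - 1u); }

inline void checkExample() {
  // Same result either way.
  assert(viaAdd(0) == viaSub(0));
  // At X == 0 the add does not wrap, but the sub does.
  assert(addNoUnsignedWrap(0, 255) && !subNoUnsignedWrap(0, 1));
  // At X == 5 it is the other way around.
  assert(!addNoUnsignedWrap(5, 255) && subNoUnsignedWrap(5, 1));
}
```

The results agree bit for bit, yet the no-wrap conditions hold on disjoint sets of inputs, so a flag proved for one form says nothing about the other.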
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On 32-bit x86 Darwin, the register mappings for the eh_frame and
debug_frame sections are different. Thus the same CFI instructions
should result in different registers in the object file. The
problem isn't target specific though; it only requires that the
mappings for EH register numbers be different from the standard
DWARF ones.
The patch looks a bit clumsy. LLVM uses the EH mapping as
canonical for everything frame related. Thus we need to do a
double conversion EH -> LLVM -> Non-EH, when emitting the
debug_frame section.
Fixes PR22363.
Differential Revision: http://reviews.llvm.org/D7593
llvm-svn: 230670
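The double conversion EH -> LLVM -> Non-EH mentioned above can be sketched with two lookup tables. Everything here is an assumption for illustration: the `Reg` enum, the table contents (two registers whose numbers are swapped between the two mappings), and the pass-through behavior for unmapped numbers are not taken from the patch.

```cpp
#include <cassert>
#include <map>

// Hypothetical internal register ids (stand-ins for LLVM register numbers).
enum class Reg { ESP, EBP };

// Illustrative tables only: the EH mapping swaps the two numbers relative
// to the non-EH (debug_frame) mapping.
const std::map<int, Reg> EHToReg = {{4, Reg::EBP}, {5, Reg::ESP}};
const std::map<Reg, int> RegToDwarf = {{Reg::ESP, 4}, {Reg::EBP, 5}};

// Convert an EH register number to the number used in debug_frame by going
// through the internal register id.
inline int ehToDebugFrame(int EHNum) {
  auto It = EHToReg.find(EHNum);
  if (It == EHToReg.end())
    return EHNum; // assumed: numbers outside the table are identical
  auto Out = RegToDwarf.find(It->second);
  assert(Out != RegToDwarf.end() && "every internal register needs a number");
  return Out->second;
}
```

Under these tables, a CFI instruction that names register 4 in the EH numbering would be written with register 5 when emitted into debug_frame.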
|
| |
|
|
|
|
|
|
| |
The shadow stack space expectations won't match.
Fixes PR22709.
llvm-svn: 230667
|
| |
|
|
|
|
|
|
|
|
| |
loads/stores
InstCombine has long had logic to convert aligned Altivec load/store intrinsics
into regular loads and stores. This mirrors that functionality for QPX vector
load/store intrinsics.
llvm-svn: 230660
|
| |
|
|
|
|
|
|
|
|
|
|
| |
have the debugger step through each one individually. Turn off the
combine for adjacent stores at -O0 so we get this behavior.
Possibly, DAGCombine shouldn't run at all at -O0, but that's for
another day; see PR22346.
Differential Revision: http://reviews.llvm.org/D7181
llvm-svn: 230659
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There was a problem when passing structures as variable arguments.
Structures smaller than 64 bits were not left-justified on MIPS64
big endian. This is now fixed by shifting the value to make it
left-justified when appropriate.
This fixes the bug http://llvm.org/bugs/show_bug.cgi?id=21608
Patch by Aleksandar Beserminji.
Differential Revision: http://reviews.llvm.org/D7881
llvm-svn: 230657
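The left-justification by shifting can be pictured on a plain 64-bit integer. This is only a sketch of the arithmetic, assuming the struct bytes sit in the low bits of a 64-bit slot; the function name and the byte-size parameter are hypothetical and do not come from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Move a SizeInBytes-wide value from the low end of a 64-bit slot to the
// high end, which is where a big-endian callee expects a small struct.
inline uint64_t leftJustify(uint64_t Value, unsigned SizeInBytes) {
  assert(SizeInBytes >= 1 && SizeInBytes <= 8 && "must fit in one slot");
  const unsigned Shift = 64 - 8 * SizeInBytes;
  // A full-width value needs no adjustment.
  return Shift == 0 ? Value : Value << Shift;
}
```

For a 3-byte struct holding `0xAABBCC`, the shift is 40 bits, giving `0xAABBCC0000000000`.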
|
| |
|
|
|
|
|
|
|
|
|
| |
For the "krait" CPU, the asm printer doesn't emit any ".cpu" directive, so the
feature bits are not computed. This patch lets the asm printer
emit a ".cpu cortex-a9" directive for krait, and the hwdiv feature is
enabled through ".arch_extension". In short, krait is treated
as "cortex-a9" with hwdiv. We cannot emit ".krait" as the CPU since
it is not supported by GNU GAS yet.
llvm-svn: 230651
|
| |
|
|
|
|
|
|
|
|
|
| |
This patch is in response to r223147, where the available features are
computed based on the ".cpu" directive. This works cleanly for the standard
variants like cortex-a9. For custom variants which rely on standard CPU names
for assembly, the additional features of a CPU should be propagated. This can be
done via ".arch_extension" as long as the assembler supports it. The
implementation for krait, along with a unit test, will be submitted in the next patch.
llvm-svn: 230650
|
| |
|
|
|
|
|
|
|
|
|
| |
accesses are via different types
Noticed this while generalizing the code for loop distribution.
I confirmed with Arnold that this was indeed a bug and managed to create
a testcase.
llvm-svn: 230647
|
| |
|
|
|
|
| |
This matches the assembly syntax for the proprietary compiler.
llvm-svn: 230645
|
| |
|
|
|
|
|
| |
The latency for the WriteMULm class was set to 4, which is actually lower than the latency for WriteMULr (5).
A better estimate would be 4 added to WriteMULr, that is, 9.
llvm-svn: 230634
|
| |
|
|
|
|
|
|
|
|
| |
formulaic into the top v8i16 lowering routine.
This makes the generalized lowering a completely general and single path
lowering which will allow generalizing it in turn for multiple 128-bit
lanes.
llvm-svn: 230623
|
| |
|
|
|
|
| |
MVT.
llvm-svn: 230622
|
| |
|
|
|
|
| |
backedge-taken count in profiling data.
llvm-svn: 230619
|
| |
|
|
|
|
|
|
|
| |
IRCE can now split the iteration space for loops like:
for (i = n; i >= 0; i--)
a[i + k] = 42; // bounds check on access
llvm-svn: 230618
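Splitting the iteration space of the decreasing loop above can be sketched by hand. This is a model of the idea rather than what IRCE emits: the function names are made up, a failed bounds check is modeled as returning `false`, and the safe range is computed directly from the array length.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reference loop: every iteration performs the bounds check.
inline bool fillChecked(std::vector<int> &A, int64_t N, int64_t K) {
  const int64_t Len = static_cast<int64_t>(A.size());
  for (int64_t I = N; I >= 0; --I) {
    const int64_t Idx = I + K;
    if (Idx < 0 || Idx >= Len)
      return false; // stand-in for the bounds-check failure path
    A[static_cast<std::size_t>(Idx)] = 42;
  }
  return true;
}

// Same behavior, with the iteration space split into three pieces.
inline bool fillSplit(std::vector<int> &A, int64_t N, int64_t K) {
  const int64_t Len = static_cast<int64_t>(A.size());
  const int64_t SafeHi = std::min(N, Len - 1 - K); // largest I with I+K < Len
  const int64_t SafeLo = std::max<int64_t>(0, -K); // smallest I with I+K >= 0
  int64_t I = N;
  // Pre-loop: iterations above the safe range keep their check.
  for (; I >= 0 && I > SafeHi; --I) {
    const int64_t Idx = I + K;
    if (Idx < 0 || Idx >= Len)
      return false;
    A[static_cast<std::size_t>(Idx)] = 42;
  }
  // Main loop: the range itself guarantees the access is in bounds.
  for (; I >= SafeLo; --I) {
    assert(I + K >= 0 && I + K < Len && "safe range must imply the check");
    A[static_cast<std::size_t>(I + K)] = 42;
  }
  // Post-loop: iterations below the safe range keep their check.
  for (; I >= 0; --I) {
    const int64_t Idx = I + K;
    if (Idx < 0 || Idx >= Len)
      return false;
    A[static_cast<std::size_t>(Idx)] = 42;
  }
  return true;
}
```

The main loop is where the benefit comes from: for `I` in `[SafeLo, SafeHi]` the index `I + K` is provably in bounds, so the check can be dropped there.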
|
| |
|
|
|
|
| |
This ordering matches that of DAG.getNode.
llvm-svn: 230617
|
| |
|
|
|
|
|
|
| |
Also remove the somewhat misleading initializers from
VectorizationFactor and VectorizationInterleave. They will get
initialized with the default ctor since no cl::init is provided.
llvm-svn: 230608
|
| |
|
|
| |
llvm-svn: 230607
|
| |
|
|
|
|
|
| |
It still prints "Assembling path/to/X86CompilationCallback_Win64.asm",
but linking does the same thing.
llvm-svn: 230596
|
| |
|
|
|
|
|
|
|
|
| |
Use the IRBuilder helpers for gc.statepoint and gc.result, instead of
coding the construction by hand. Note that the gc.statepoint IRBuilder
handles only CallInst, not InvokeInst; retain that part of hand-coding.
Differential Revision: http://reviews.llvm.org/D7518
llvm-svn: 230591
|
| |
|
|
|
|
|
|
|
|
| |
Explanation: This function is in TargetLowering because it uses
RegClassForVT which would need to be moved to TargetRegisterInfo
and would necessitate moving isTypeLegal over as well - a massive
change that would just require TargetLowering having a TargetRegisterInfo
class member that it would use.
llvm-svn: 230585
|
| |
|
|
|
|
|
|
|
| |
This required plumbing a TargetRegisterInfo through computeRegisterProperties
and into findRepresentativeClass which uses it for register class
iteration. This required passing a subtarget into a few target specific
initializations of TargetLowering.
llvm-svn: 230583
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D7644
llvm-svn: 230582
|
| |
|
|
|
|
| |
This particular subtype of Mach-O was missing. Add it.
llvm-svn: 230567
|
| |
|
|
|
|
|
|
|
| |
This symbol exists only to pull in the required pieces of the runtime,
so nothing ever needs to refer to it. Making it hidden avoids the
potential for issues with duplicate symbols when linking profiled
libraries together.
llvm-svn: 230566
|
| |
|
|
|
|
|
|
|
|
|
| |
Remove a newline from `AssemblyWriter::printMDNodeBody()`, and add one
to `AssemblyWriter::writeMDNode()`. NFCI for assembly output.
However, this drops an inconsistent newline from `Metadata::print()`
when `this` is an `MDNode`. Now the newline added by `Metadata::dump()`
won't look so verbose.
llvm-svn: 230565
|
| |
|
|
|
|
|
|
|
|
|
|
| |
non-zero
This is a follow-on to r227491 which tightens the check for propagating FP
values. If a non-constant value happens to be a zero, we would hit the same
bug as before.
Bug noted and patch suggested by Eli Friedman.
llvm-svn: 230564
|
| |
|
|
|
|
|
| |
Remove a default argument that's never passed and a constructor that's
never called.
llvm-svn: 230563
|
| |
|
|
|
|
|
|
|
| |
the .h file. It's used in only one place (other than recursively)
and there's no need to include it everywhere.
Saves almost 900k from total llvm object file size.
llvm-svn: 230561
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
punning
Summary: SROA generates code that isn't quite as easy to optimize and contains unusual-sized shuffles, but that code is generally correct. As discussed in D7487 the right place to clean things up is InstCombine, which will pick up the type-punning pattern and transform it into a more obvious bitcast+extractelement, while leaving the other patterns SROA encounters as-is.
Test Plan: make check
Reviewers: jvoung, chandlerc
Subscribers: llvm-commits
llvm-svn: 230560
|
| |
|
|
| |
llvm-svn: 230559
|
| |
|
|
|
|
|
|
|
| |
It turns out we have a macro to ensure that debuggers can access
`dump()` methods. Use it. Hopefully this will prevent me (and others)
from committing crimes like in r223802 (search for /10000/, or just see
the fix in r224407).
llvm-svn: 230555
|
| |
|
|
|
|
| |
It seems ArrayRefs to multi-dimensional arrays confuse some compilers.
llvm-svn: 230554
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
LDtocL, and other loads that roughly correspond to the TOC_ENTRY SDAG node,
represent loads from the TOC, which is invariant. As a result, these loads can
be hoisted out of loops, etc. In order to do this, we need to generate
GOT-style MMOs for TOC_ENTRY, which requires treating it as a legitimate memory
intrinsic node type. Once this is done, the MMO transfer is automatically
handled for TableGen-driven instruction selection, and for nodes generated
directly in PPCISelDAGToDAG, we need to transfer the MMOs manually.
Also, we were not transferring MMOs associated with pre-increment loads, so do
that too.
Lastly, this fixes an exposed bug where R30 was not added as a defined operand of
UpdateGBR.
This problem was highlighted by an example (used to generate the test case)
posted to llvmdev by Francois Pichet.
llvm-svn: 230553
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for pretty-printing instruction operands. The new
output looks like:
00000000 00000010 ffffffff CIE
Version: 1
Augmentation:
Code alignment factor: 1
Data alignment factor: -4
Return address column: 8
DW_CFA_def_cfa: reg4 +4
DW_CFA_offset: reg8 -4
DW_CFA_nop:
DW_CFA_nop:
00000014 00000010 00000000 FDE cie=00000000 pc=00000000...00000022
DW_CFA_advance_loc: 3
DW_CFA_def_cfa_offset: +12
DW_CFA_nop:
llvm-svn: 230551
|
| |
|
|
|
|
|
| |
CIE pointers were never filled in before, and printing the pointer
is totally pointless anyway.
llvm-svn: 230550
|
| |
|
|
|
|
|
| |
Move FrameEntry::dumpInstructions further down in the file, to a
place where it can see the declarations of FDE and CIE.
llvm-svn: 230549
|
| |
|
|
|
|
| |
To be used for dumping.
llvm-svn: 230548
|
| |
|
|
|
|
|
|
| |
This is the first commit in a small series aiming at making
the debug_frame dump more useful (right now it prints a list of
operations without their operands).
llvm-svn: 230547
|
| |
|
|
|
|
|
|
| |
r230290 released the LLVM module but not the LTOModule.
rdar://19024554
llvm-svn: 230544
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Win64 epilogue structure is very restrictive: it permits a very
small number of opcodes and none of them are 'mov'.
This means that given:
mov %rbp, %rsp
pop %rbp
The mov isn't part of the epilogue, only the pop is. This is problematic unless
a frame pointer is present, in which case we are free to do whatever we'd
like in the "body" of the function. If a frame pointer is present,
unwinding will undo the prologue operations in reverse order regardless
of the fact that we are at an instruction which is resetting the stack
pointer.
llvm-svn: 230543
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change aligns globals to the next highest power of 2 bytes, up to a
maximum of 128. This makes it more likely that we will be able to compress
bit sets with a greater alignment. In many more cases, we can now take
advantage of a new optimization also introduced in this patch that removes
bit set checks if the bit set is all ones.
The 128 byte maximum was found to provide the best tradeoff between instruction
overhead and data overhead in a recent build of Chromium. It allows us to
remove ~2.4MB of instructions at the cost of ~250KB of data.
Differential Revision: http://reviews.llvm.org/D7873
llvm-svn: 230540
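The alignment rule described in this message can be sketched as a small helper. It is an illustration under assumptions: the function name is hypothetical, and "next highest power of 2" is read here as the smallest power of two that is at least the size, capped at 128.

```cpp
#include <cassert>
#include <cstdint>

// Pick an alignment for a global of the given size: round up to a power of
// two, but never beyond the 128-byte cap mentioned in the message.
inline uint64_t bitSetAlignment(uint64_t SizeInBytes) {
  constexpr uint64_t MaxAlign = 128;
  uint64_t Align = 1;
  while (Align < SizeInBytes && Align < MaxAlign)
    Align <<= 1;
  assert((Align & (Align - 1)) == 0 && "result must be a power of two");
  return Align;
}
```

A 3-byte global would get 4-byte alignment and a 100-byte global 128-byte alignment, while anything larger than 128 bytes stays at the 128-byte cap.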
|
| |
|
|
| |
llvm-svn: 230535
|