| Commit message | Author | Age | Files | Lines |
| |
The series of patches leading up to this one makes llc -O0 run 8% faster.
When deallocating a MachineFunction, there is no need to visit all
MachineInstr and MachineOperand objects to deallocate them. All their
memory comes from a BumpPtrAllocator that is about to be purged, and they
have empty destructors anyway.
This only applies when deallocating the MachineFunction.
DeleteMachineInstr() should still be used to recycle MI memory during
the codegen passes.
Remove the LeakDetector support for MachineInstr. I've never seen it
used before, and now it definitely doesn't work. With this patch, leaked
MachineInstrs would be much less of a problem since all of their memory
will be reclaimed by ~MachineFunction().
llvm-svn: 171599
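As a rough illustration of why per-object deallocation can be skipped, the sketch below uses a toy bump allocator that accepts only trivially destructible types and releases everything in one call. `BumpArena`, `Instr`, and the slab size are hypothetical stand-ins, not LLVM's BumpPtrAllocator or MachineInstr.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for an instruction object with no destructor logic.
struct Instr {
  int Opcode;
  int NumOperands;
};

// Toy bump allocator: hands out memory from fixed-size slabs and frees all of
// it at once. It never runs destructors, so it only accepts types that do not
// need one.
class BumpArena {
  std::vector<std::unique_ptr<unsigned char[]>> Slabs;
  std::size_t SlabSize;
  std::size_t Used; // bytes consumed in the newest slab

public:
  explicit BumpArena(std::size_t SlabBytes = 4096)
      : SlabSize(SlabBytes), Used(SlabBytes) {}

  template <typename T, typename... Args> T *create(Args &&...Values) {
    static_assert(std::is_trivially_destructible<T>::value,
                  "arena never calls destructors");
    static_assert(alignof(T) <= alignof(std::max_align_t),
                  "over-aligned types are not supported");
    assert(sizeof(T) <= SlabSize && "object larger than a slab");
    // Round the bump offset up to the alignment of T.
    std::size_t Offset = (Used + alignof(T) - 1) & ~(alignof(T) - 1);
    if (Slabs.empty() || Offset + sizeof(T) > SlabSize) {
      Slabs.push_back(std::make_unique<unsigned char[]>(SlabSize));
      Offset = 0;
    }
    Used = Offset + sizeof(T);
    return ::new (Slabs.back().get() + Offset) T{std::forward<Args>(Values)...};
  }

  std::size_t slabCount() const { return Slabs.size(); }

  // Release every slab in one step; no object is visited individually.
  void reset() {
    Slabs.clear();
    Used = SlabSize;
  }
};
```

Calling `reset()` plays the role that destroying the owning function plays in the message above: the cost depends on the number of slabs, not on the number of objects allocated.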
| |
Instead of an std::vector<MachineOperand>, use MachineOperand arrays
from an ArrayRecycler living in MachineFunction.
This has several advantages:
- MachineInstr now has a trivial destructor, making it possible to
delete them in batches when destroying MachineFunction. This will be
enabled in a later patch.
- Bypassing malloc() and free() can be faster, depending on the system
library.
- MachineInstr objects and their operands are allocated from the same
BumpPtrAllocator, so they will usually be next to each other in
memory, providing better locality of reference.
- Reduce MachineInstr footprint. A std::vector is 24 bytes, the new
operand array representation only uses 8+4+1 bytes in MachineInstr.
- Better control over operand array reallocations. In the old
representation, the use-def chains would be reordered whenever a
std::vector reached its capacity. The new implementation never changes
the use-def chain order.
Note that some decisions in the code generator depend on the use-def
chain orders, so this patch may cause different assembly to be produced
in a few cases.
llvm-svn: 171598
| |
This function works like memmove() for MachineOperands, except it also
updates any use-def chains containing the moved operands.
The use-def chains are updated without affecting the order of operands
in the list. That isn't possible when using the
removeRegOperandFromUseList() and addRegOperandToUseList() functions.
Callers to follow soon.
llvm-svn: 171597
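The idea of a memmove()-like move that also repairs list links can be sketched as follows. This is a simplified model, not the LLVM implementation: `Operand`, `UseList`, and this `moveOperands` signature are hypothetical, every moved operand is assumed to be on the one list passed in, and both pointers are assumed to point into the same array.

```cpp
#include <cstddef>

// Hypothetical operand: a payload plus intrusive links into a use list.
struct Operand {
  int Reg = 0;
  Operand *Prev = nullptr;
  Operand *Next = nullptr;
};

// Minimal doubly linked use list with explicit head and tail.
struct UseList {
  Operand *Head = nullptr;
  Operand *Tail = nullptr;

  void append(Operand *Op) {
    Op->Prev = Tail;
    Op->Next = nullptr;
    if (Tail)
      Tail->Next = Op;
    else
      Head = Op;
    Tail = Op;
  }
};

// Move N operands from Src to Dst, then repoint the neighbours of each moved
// operand at its new address. The order of the list is left untouched.
void moveOperands(UseList &List, Operand *Dst, Operand *Src, std::size_t N) {
  if (N == 0 || Dst == Src)
    return;
  // Copy back to front when the destination overlaps the end of the source,
  // exactly as memmove() would.
  bool Backward = Dst > Src && Dst < Src + N;
  for (std::size_t I = 0; I < N; ++I) {
    std::size_t K = Backward ? N - 1 - I : I;
    Operand *To = Dst + K;
    *To = Src[K];
    if (To->Prev)
      To->Prev->Next = To;
    else
      List.Head = To;
    if (To->Next)
      To->Next->Prev = To;
    else
      List.Tail = To;
  }
}
```

Because each operand keeps its own Prev and Next values and only the neighbours are patched, no element is unlinked and re-appended, which is what would change the list order.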
| |
legality of an address mode to not use a struct of four values and
instead to accept them as parameters. I'd love to have named parameters
here as most callers only care about one or two of these, but the
defaults aren't terribly scary to write out.
That said, there is no real impact of this as the passes aren't yet
using STTI for this and are still relying upon TargetLowering.
llvm-svn: 171595
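A minimal sketch of the interface style described above, with defaulted parameters in place of a struct. The parameter names, the opaque `GlobalValue` placeholder, and the legality rule itself are assumptions made for illustration; they are not taken from the patch.

```cpp
#include <cstdint>

struct GlobalValue; // opaque placeholder; only used through a pointer

// Toy legality check: callers pass only the components they care about and
// rely on the defaults for the rest.
bool isLegalAddressingMode(const GlobalValue *BaseGV = nullptr,
                           std::int64_t BaseOffset = 0,
                           bool HasBaseReg = false,
                           std::int64_t Scale = 0) {
  // Assumed rule 1: no global base address in this toy target.
  if (BaseGV != nullptr)
    return false;
  // Assumed rule 2: the immediate must fit in a signed 12-bit field.
  if (BaseOffset < -2048 || BaseOffset > 2047)
    return false;
  // Assumed rule 3: the index register may only be unscaled.
  if (Scale != 0 && Scale != 1)
    return false;
  // Assumed rule 4: an index register requires a base register.
  if (Scale == 1 && !HasBaseReg)
    return false;
  return true;
}
```

A caller interested only in an immediate offset would write `isLegalAddressingMode(nullptr, 100, true)`, spelling out the leading defaults by hand, which is the trade-off the message mentions.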
| |
next to its only user. This helper relies on TargetLowering information
that shouldn't be generally used throughout the Transforms library, and
so it made little sense as a generic utility.
This also consolidates the file where we need to remove the remaining
uses of TargetLowering in favor of the IR-layer abstract interface in
TargetTransformInfo.
llvm-svn: 171590
| |
and add stack alignment information.
llvm-svn: 171587
| |
The Attribute class is eventually going to represent one attribute. So we need
this class to create the set of attributes. Add some iterator methods to the
builder to access its internal bits in a nice way.
llvm-svn: 171586
| |
as long as the reduction chain is used in the LHS.
PR14803.
llvm-svn: 171583
| |
leaving this undefined, and despite the sentence in the standard that
seems to require it, I'll cede the point and assume it's a bug in the
wording. Other parts of POSIX regularly allow for things to be -1
instead of undefined, this should too. Makes things more consistent too.
This should have no real impact for folks though.
llvm-svn: 171574
| |
defines _POSIX_CPUTIME but doesn't support the clock_* functions.
I don't test the value of _POSIX_CPUTIME because the spec merely says
that if it is defined, the CPU-specific timers are available, whereas it
says that _POSIX_TIMERS must be defined and defined to a value greater
than zero. However, this may not work, as the POSIX spec clearly states:
"If the symbolic constant _POSIX_CPUTIME is defined, then the symbolic
constant _POSIX_TIMERS shall also be defined by the implementation to
have the value 200112L."
If this doesn't work, I'll add more hacks for Darwin.
llvm-svn: 171565
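The guard described above can be sketched as a preprocessor check that tests the value of _POSIX_TIMERS but only the presence of _POSIX_CPUTIME. The macro `SKETCH_HAVE_PROCESS_CPU_CLOCK` and the function `processCpuSeconds` are hypothetical names, and the fallback to std::clock() is an assumption for illustration rather than what the patch does.

```cpp
#include <ctime>
#include <time.h>
#include <unistd.h>

// Test the *value* of _POSIX_TIMERS, but only the *presence* of
// _POSIX_CPUTIME, following the reasoning in the message above.
#if defined(_POSIX_TIMERS) && (_POSIX_TIMERS > 0) && defined(_POSIX_CPUTIME)
#define SKETCH_HAVE_PROCESS_CPU_CLOCK 1
#else
#define SKETCH_HAVE_PROCESS_CPU_CLOCK 0
#endif

// CPU time consumed by this process, in seconds.
double processCpuSeconds() {
#if SKETCH_HAVE_PROCESS_CPU_CLOCK
  timespec TS;
  if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &TS) == 0)
    return static_cast<double>(TS.tv_sec) +
           static_cast<double>(TS.tv_nsec) * 1e-9;
#endif
  // Assumed fallback for systems without the clock_* functions.
  return static_cast<double>(std::clock()) / CLOCKS_PER_SEC;
}
```

A value of -1 for _POSIX_TIMERS also fails the `> 0` test, so the same guard covers implementations that define the macro to signal "unsupported".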
| |
llvm-svn: 171559
| |
The bit mask thing will be a thing of the past. It's not extensible enough. Get
rid of its use here. Opt instead for using a vector to hold the attributes.
Note: Some of this code will become obsolete once the rewrite is further along.
llvm-svn: 171553
| |
wall time, user time, and system time since a process started.
For walltime, we currently use TimeValue's interface and a global
initializer to compute a close approximation of total process runtime.
For user time, this adds support for a somewhat more precise timing
mechanism -- clock_gettime with the CLOCK_PROCESS_CPUTIME_ID clock
selected.
For system time, we have to do a full getrusage call to extract the
system time from the OS. This is expensive but unavoidable.
In passing, clean up the implementation of the old APIs and fix some
latent bugs in the Windows code. This might have manifested on Windows
ARM systems or other systems with strange 64-bit integer behavior.
The old API for this computed both user time and system time simultaneously from
a single getrusage call. While this results in fewer system calls, it
also results in a lower precision user time and if only user time is
desired, it introduces a higher overhead. It may be worthwhile to switch
some of the pass timers to not track system time and directly track user
and wall time. The old API also tracked walltime in a confusing way --
it just set it to the current walltime rather than providing any measure
of wall time since the process started the way both user and system time
are tracked. The new API is more consistent here.
The plan is to eventually implement these methods for a *child* process
by using the wait3(2) system call to populate an rusage struct
representing the whole subprocess execution. That way, after waiting on
a child process its stats will become accurate and cheap to query.
llvm-svn: 171551
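Two of the measurements described above can be sketched with standard POSIX and C++ facilities: wall time relative to a global initializer, and system time from a full getrusage() call. The function names are hypothetical and std::chrono stands in for TimeValue, so this is an approximation of the approach rather than the patch's code.

```cpp
#include <chrono>
#include <sys/resource.h>
#include <sys/time.h>

namespace {
// Captured during static initialization, which approximates process start.
const std::chrono::steady_clock::time_point ProcessStart =
    std::chrono::steady_clock::now();
} // namespace

// Wall-clock seconds elapsed since the global initializer ran.
double wallSecondsSinceStart() {
  std::chrono::duration<double> Elapsed =
      std::chrono::steady_clock::now() - ProcessStart;
  return Elapsed.count();
}

// System (kernel) CPU seconds for this process, or -1.0 on failure.
double systemSeconds() {
  struct rusage Usage;
  if (getrusage(RUSAGE_SELF, &Usage) != 0)
    return -1.0;
  return static_cast<double>(Usage.ru_stime.tv_sec) +
         static_cast<double>(Usage.ru_stime.tv_usec) * 1e-6;
}
```

Keeping the two queries separate means a caller that only wants wall time never pays for the getrusage() system call.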
| |
because conditions in the next case prevented from doing anything nasty.
llvm-svn: 171549
| |
The R600 target has test cases that exercise this code.
llvm-svn: 171538
| |
Since subtraction does not commute, the loop vectorizer incorrectly vectorizes
reductions such as x = A[i] - x.
Disabling for now.
llvm-svn: 171537
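A small worked example of the problem: the serial loop below computes x = A[i] - x, while the second function splits the work across two independent accumulators, roughly as a naive vectorized reduction might. The two-lane scheme and the way the lanes are combined are assumptions chosen for illustration, and the input length is assumed to be even.

```cpp
#include <cstddef>
#include <vector>

// Reference semantics: each step depends on the previous value of X.
int serialReduce(const std::vector<int> &A, int X) {
  for (int Elem : A)
    X = Elem - X;
  return X;
}

// Naive two-lane version: even and odd elements are accumulated separately
// and the lanes are merged at the end. This is valid for addition but not for
// x = A[i] - x, because the operand order matters.
int naiveTwoLaneReduce(const std::vector<int> &A, int X) {
  int Lane0 = X;
  int Lane1 = 0;
  for (std::size_t I = 0; I + 1 < A.size(); I += 2) {
    Lane0 = A[I] - Lane0;
    Lane1 = A[I + 1] - Lane1;
  }
  return Lane0 - Lane1;
}
```

For A = {1, 2, 3, 4} and x = 0 the serial loop yields 2, while the two-lane version yields 0.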
| |
types and a FIXME for what we should be doing. Should solve the
immediacy of PR12069 where our debug info is crashing another
tool.
llvm-svn: 171536
| |
objc_retainAutoreleasedReturnValue.
llvm-svn: 171535
| |
the method where it was being called when I should have just prefixed the actual message with Pass::Method.
Additionally I fixed some whitespace issues.
llvm-svn: 171534
| |
llvm-svn: 171525
| |
returns early then it is slightly faster to execute a sequence of NOP
instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction
until the return address is ready.
When compiling for X86 Atom only, this patch will run a pass called
"X86PadShortFunction", which will add NOP instructions where fewer than four
cycles elapse between function entry and return.
It includes tests.
Patch by Andy Zhang.
llvm-svn: 171524
| |
* Remove dead methods.
* Use the 'operator==' method instead of 'contains', which isn't needed.
* Fix some comments.
No functionality change.
llvm-svn: 171523
| |
to create a properly aligned reader.
llvm-svn: 171520
| |
vectors are being compared.
llvm-svn: 171517
| |
llvm-svn: 171515
| |
Update test case to verify flow sequence is
written as a flow sequence.
llvm-svn: 171514
| |
shift_rotate_imm64.
llvm-svn: 171513
| |
reachability.
We conservatively approximate the reachability analysis by saying it is not
reachable if there is a single path starting from "From" and the path does not
reach "To".
rdar://12801584
llvm-svn: 171512
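The conservative approximation described above can be sketched on a plain successor graph: follow the path from "From" only while each block has exactly one successor, and answer "not reachable" only when that single path ends without meeting "To". The `Graph` type and the `mayReach` name are hypothetical, and treating a block with several successors as "possibly reachable" is the conservative assumption.

```cpp
#include <cstddef>
#include <vector>

// Successor lists indexed by block number.
using Graph = std::vector<std::vector<int>>;

// Returns false only when it is certain that To cannot be reached from From.
bool mayReach(const Graph &G, int From, int To) {
  int Cur = From;
  // A single-successor path has at most G.size() distinct blocks, so this
  // bound also terminates the walk on a cycle.
  for (std::size_t Steps = 0; Steps <= G.size(); ++Steps) {
    if (Cur == To)
      return true;
    const std::vector<int> &Succs = G[Cur];
    if (Succs.empty())
      return false; // the only path ended without reaching To
    if (Succs.size() > 1)
      return true; // more than one path: give up and assume reachable
    Cur = Succs[0];
  }
  return false; // the only path is a cycle that never visits To
}
```

The check is cheap and one-sided: a `true` result means "unknown, assume reachable", while `false` is a proof of unreachability.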
| |
llvm-svn: 171511
| |
llvm-svn: 171510
| |
llvm-svn: 171507
| |
This patch fixes the PPC eh_frame definitions for the personality and
frame unwinding for PIC objects. It makes PIC builds correctly create
relative relocations in the '.rela.eh_frame' segments, thus avoiding
a text relocation that generates a DT_TEXTREL segment in the link phase.
llvm-svn: 171506
| |
don't have this hook.
llvm-svn: 171489
| |
llvm-svn: 171487
| |
llvm-svn: 171475
| |
string offset section.
llvm-svn: 171474
| |
size actually hurts the performance on many programs.
llvm-svn: 171471
| |
1. Add code to estimate register pressure.
2. Add code to select the unroll factor based on register pressure.
3. Add bits to TargetTransformInfo to provide the number of registers.
llvm-svn: 171469
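One way the three steps above could fit together is sketched below: given a register count from the target and a pressure estimate for the loop, pick the largest power-of-two unroll factor whose copies of the loop body still fit in registers. The function name, its parameters, the power-of-two rounding, and the cap of 8 are all assumptions for illustration, not the heuristic from the patch.

```cpp
// NumRegs:           registers the target provides (assumed to come from a
//                    target query such as the one added in step 3).
// LoopInvariantRegs: registers held by values that live across the loop.
// MaxLiveRegs:       peak number of simultaneously live values in one
//                    iteration, as produced by a pressure estimate.
unsigned selectUnrollFactor(unsigned NumRegs, unsigned LoopInvariantRegs,
                            unsigned MaxLiveRegs, unsigned MaxFactor = 8) {
  if (MaxLiveRegs == 0 || LoopInvariantRegs >= NumRegs)
    return 1;
  // How many copies of the loop body fit in the remaining registers.
  unsigned Fit = (NumRegs - LoopInvariantRegs) / MaxLiveRegs;
  // Round down to a power of two and respect the cap.
  unsigned Factor = 1;
  while (Factor * 2 <= Fit && Factor * 2 <= MaxFactor)
    Factor *= 2;
  return Factor;
}
```

With 16 registers, 2 of them loop-invariant, and a peak of 3 live values per iteration, four copies fit, so the sketch returns 4.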
| |
tests fail. Original message:
Simplified TRUNCATE operation that comes after SETCC. It is possible since SETCC result is 0 or -1.
Added a test.
llvm-svn: 171468
| |
SETCC result is 0 or -1.
Added a test.
llvm-svn: 171467
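The arithmetic behind the simplification can be checked on plain integers: when the comparison result is either 0 or all ones (-1), truncating it to a narrower type gives the same value as producing the narrow result directly. The function names and the 32-bit and 8-bit widths are hypothetical; this models only the value identity, not the DAG transformation.

```cpp
#include <cstdint>

// Hypothetical model of a SETCC whose "true" value is all ones (-1).
std::uint32_t setccWide(bool Cond) { return Cond ? 0xFFFFFFFFu : 0u; }

// TRUNCATE applied after the wide SETCC: keep only the low 8 bits.
std::uint8_t truncateAfterSetcc(bool Cond) {
  return static_cast<std::uint8_t>(setccWide(Cond));
}

// The simplified form: produce the narrow 0 / all-ones value directly.
std::uint8_t setccNarrow(bool Cond) {
  return Cond ? std::uint8_t(0xFF) : std::uint8_t(0);
}
```

The identity relies on the 0 or -1 convention; with a 0 or 1 convention the truncated value would still match, but a sign-extended use of it would not.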
| |
when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks."
This reverts commit r171461 since it breaks the following tests:
Clang :: Analysis/outofbound-notwork.c
Clang :: Analysis/string-fail.c
Clang :: CXX/basic/basic.lookup/basic.lookup.qual/p6-0x.cpp
Clang :: CXX/basic/basic.lookup/basic.lookup.unqual/p15.cpp
Clang :: CXX/dcl.dcl/dcl.spec/dcl.fct.spec/p4.cpp
Clang :: CXX/dcl.dcl/dcl.spec/dcl.stc/p10.cpp
Clang :: CXX/temp/temp.param/p14.cpp
Clang :: CXX/temp/temp.res/temp.dep.res/temp.point/p1.cpp
Clang :: CodeGen/2009-02-13-zerosize-union-field-ppc.c
Clang :: CodeGen/blocks-2.c
Clang :: CodeGen/libcalls-d.c
Clang :: CodeGen/libcalls-ld.c
Clang :: CodeGenCXX/conversion-function.cpp
Clang :: CodeGenCXX/debug-info-limit-type.cpp
Clang :: CodeGenCXX/inheriting-constructor.cpp
Clang :: FixIt/fixit-errors.c
Clang :: FixIt/fixit-pmem.cpp
Clang :: Modules/namespaces.cpp
Clang :: PCH/changed-files.c
Clang :: PCH/pr4489.c
Clang :: PCH/source-manager-stack.c
Clang :: Parser/cxx-ambig-decl-expr-xfail.cpp
Clang :: SemaCXX/switch-implicit-fallthrough-cxx98.cpp
Clang :: SemaTemplate/instantiate-function-1.mm
llvm-svn: 171466
| |
processed when said queue was really a list to state a list had finished being processed.
llvm-svn: 171465
| |
ObjCARCAPElim::OptimizeBB.
llvm-svn: 171464
| |
*p = null.
llvm-svn: 171463
| |
architectures where this is required to perform a retainAutoreleasedReturnValue optimization.
llvm-svn: 171462
| |
dividing by 0. This is needed to keep early if conversion from moving them across basic blocks.
llvm-svn: 171461
| |
In order to cost subvector insertion and extraction, we need to know
the type of the subvector being extracted.
No functionality change.
llvm-svn: 171453
| |
llvm-svn: 171450
| |
before the last time.
--- Reverse-merging r171442 into '.':
U include/llvm/IR/Attributes.h
U lib/IR/Attributes.cpp
U lib/IR/AttributeImpl.h
llvm-svn: 171448
| |
--- Reverse-merging r171441 into '.':
U include/llvm/IR/Attributes.h
U lib/IR/Attributes.cpp
llvm-svn: 171444
|