| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
v2:
- Fix LDS size calculation
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 193621
|
|
|
|
| |
llvm-svn: 193620
|
|
|
|
|
|
| |
Based on D2050 by Timur Iskhodzhanov.
llvm-svn: 193619
|
|
|
|
| |
llvm-svn: 193618
|
|
|
|
| |
llvm-svn: 193617
|
|
|
|
| |
llvm-svn: 193616
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MSVC can't comprehend
  template<typename T, size_t N>
  ArrayRef<T> makeArrayRef(const T (&Arr)[N]) {
    return ArrayRef<T>(Arr);
  }
if Arr is
  static const uint8_t sizes[];
declared in a templated class and defined a few lines later.
I'll send a proper fix (i.e. get rid of unnecessary templates) for review soon.
llvm-svn: 193604
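For reference, a minimal sketch of the deduction that makeArrayRef relies on. ArrayRefSketch, makeArrayRefSketch and the Sizes table are hypothetical stand-ins, not the LLVM definitions. The sketch only shows the case that works everywhere: the array bound is visible at the call site, so N can be deduced. The failing case described above, an array declared without a bound inside a template and defined later, is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for llvm::ArrayRef: a non-owning (pointer, length) view.
template <typename T>
class ArrayRefSketch {
  const T *Data;
  std::size_t Length;

public:
  ArrayRefSketch(const T *D, std::size_t N) : Data(D), Length(N) {}

  // Deduces N from a reference to an array whose bound is known.
  template <std::size_t N>
  ArrayRefSketch(const T (&Arr)[N]) : Data(Arr), Length(N) {}

  std::size_t size() const { return Length; }

  const T &operator[](std::size_t I) const {
    assert(I < Length && "index out of range");
    return Data[I];
  }
};

// Same shape as the helper quoted in the commit message.
template <typename T, std::size_t N>
ArrayRefSketch<T> makeArrayRefSketch(const T (&Arr)[N]) {
  return ArrayRefSketch<T>(Arr);
}

// Hypothetical table with a complete type at the point of use.
static const std::uint8_t Sizes[] = {1, 2, 4, 8};

std::size_t sizesLength() { return makeArrayRefSketch(Sizes).size(); }
```

Because the bound of Sizes is known where makeArrayRefSketch is called, the compiler can deduce N without help.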
|
|
|
|
|
|
|
|
| |
Adds a subtarget feature for the CRC instructions (optional in v8-A) to the ARM (32-bit) backend.
Differential Revision: http://llvm-reviews.chandlerc.com/D2036
llvm-svn: 193599
|
|
|
|
|
|
| |
Sorry Peter Zotov, entirely my fault.
llvm-svn: 193598
|
|
|
|
|
|
|
|
| |
Patch by Peter Zotov
Differential Revision: http://llvm-reviews.chandlerc.com/D1910
llvm-svn: 193597
|
|
|
|
|
|
|
| |
This is used in the Linux kernel, and effectively just means "print an
address".
llvm-svn: 193593
|
|
|
|
|
|
|
|
|
|
|
|
| |
after the DIE creation, we construct the context first.
This touches creation of namespaces and global variables. The purpose is to
handle all DIE creations similarly: construct the context first, then create
the DIE and immediately add the DIE to its parent.
We use createAndAddDIE to wrap around "new DIE(".
llvm-svn: 193589
|
|
|
|
| |
llvm-svn: 193579
|
|
|
|
| |
llvm-svn: 193576
|
|
|
|
|
|
| |
error: conversion from `const uint8_t*' to non-scalar type `llvm::ArrayRef<unsigned char>' requested
llvm-svn: 193575
|
|
|
|
|
|
|
|
|
| |
Updated a test case that assumed that <2 x double> would vectorize to use
<4 x float>.
radar://15338229
llvm-svn: 193574
|
|
|
|
|
|
|
|
|
| |
By vectorizing a series of srl, or, ... instructions we have obfuscated the
intention so much that the backend does not know how to fold this code away.
radar://15336950
llvm-svn: 193573
|
|
|
|
|
|
|
|
|
| |
No test case, because with the current cost model we don't see a difference.
An upcoming ARM memory cost model change will expose and test this bug.
radar://15332579
llvm-svn: 193572
|
|
|
|
|
|
|
| |
ELF. They can overlap with the other symbols, e.g. if a source file
"foo.c" contains a function "foo" with a static variable "c".
llvm-svn: 193569
|
|
|
|
|
|
|
|
|
| |
This commit ensures DIEs are constructed within a compile unit and
immediately added to their parents.
Reviewed off-list by Eric.
llvm-svn: 193568
|
|
|
|
|
|
|
|
|
|
|
| |
More patches will be submitted to convert "new DIE(" to use createAndAddDIE in
DwarfCompileUnit.cpp. This will simplify implementation of addDIEEntry where
we have to decide between ref4 and ref_addr, because DIEs that can be shared
across CUs will be added to a CU already.
Reviewed off-list by Eric.
llvm-svn: 193567
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It wraps around "new DIE(" and handles the bookkeeping part of the newly-created
DIE. It adds the DIE to its parent, and calls insertDIE if necessary. It makes
sure that bookkeeping is done at the earliest time and we should not see
parentless DIEs if all constructions of DIEs go through this helper function.
Later on, we can use an allocator for DIE allocation, and will only need to
change createAndAddDIE instead of modifying all the "new DIE(".
Reviewed off-list by Eric.
llvm-svn: 193566
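The idea of the helper can be sketched as follows. DieSketch and createAndAddSketch are hypothetical names for illustration, not the DwarfCompileUnit code, and the insertDIE bookkeeping is left out. The only point shown is that creation and attachment happen in one step, so the caller never holds a parentless node.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical, simplified debug-info entry: a tag, a parent and owned children.
struct DieSketch {
  unsigned Tag;
  DieSketch *Parent = nullptr;
  std::vector<std::unique_ptr<DieSketch>> Children;

  explicit DieSketch(unsigned T) : Tag(T) {}
};

// Creates a node and attaches it to Parent immediately. This is the single
// place that would change if allocation moved to a custom allocator later.
DieSketch *createAndAddSketch(unsigned Tag, DieSketch &Parent) {
  assert(Tag != 0 && "a DIE needs a non-zero tag");
  std::unique_ptr<DieSketch> Node(new DieSketch(Tag));
  Node->Parent = &Parent;
  Parent.Children.push_back(std::move(Node));
  return Parent.Children.back().get();
}
```

Routing every construction through one function is what makes the later allocator change a one-line edit.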
|
|
|
|
|
|
|
| |
Complicated CU-DIE-specific logic in the latter was never used,
and it makes sense to have safety checks for broken dwarf in the former.
llvm-svn: 193563
|
|
|
|
|
|
| |
DWARFDIE::extractFast() interface. No functionality change.
llvm-svn: 193560
|
|
|
|
|
|
| |
function size
llvm-svn: 193555
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Use DWARF4 table of form classes to fetch attributes from DIE
in a more consistent way. This shouldn't change the functionality and
serves as a refactoring for upcoming change: DW_AT_high_pc has different
semantics depending on its form class.
Reviewers: dblaikie, echristo
Reviewed By: echristo
CC: echristo, llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1961
llvm-svn: 193553
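The form-class dependence mentioned in the summary can be sketched like this. In DWARF4, a DW_AT_high_pc of class address is the end address itself, while a value of class constant is an offset from DW_AT_low_pc. FormClassSketch and resolveHighPC are hypothetical names, not the DebugInfo library interface.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical subset of the DWARF4 form classes relevant to DW_AT_high_pc.
enum class FormClassSketch { Address, Constant };

// Returns the address one past the end of the entity.
std::uint64_t resolveHighPC(std::uint64_t LowPC, std::uint64_t HighPCValue,
                            FormClassSketch Class) {
  // Class address: the value already is the end address.
  if (Class == FormClassSketch::Address)
    return HighPCValue;
  // Class constant: the value is an unsigned offset from DW_AT_low_pc.
  return LowPC + HighPCValue;
}
```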
|
|
|
|
|
|
| |
No functionality change.
llvm-svn: 193540
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
an MCExpr, in order to avoid writing an encoded zero value in the immediate
field.
When getUnconditionalBranchTargetOpValue is called with an MCExpr target, we
don't know what the final immediate field value should be. We shouldn't
explicitly set the immediate field to an encoded zero value as zero is encoded
with a non-zero bit pattern. This leads to bits being set that pollute the
final immediate value. The nature of the encoding is such that the polluted
bits only affect very large immediate values, explaining why this hasn't
caused problems earlier.
Fixes <rdar://problem/15155975>.
llvm-svn: 193535
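A small sketch of why an encoded zero is not an all-zero bit pattern. It assumes the Thumb-2 wide branch layout, where the offset is split into S, J1, J2, imm10 and imm11 with J1 = NOT(I1 XOR S) and J2 = NOT(I2 XOR S); the commit message does not name the encoding, so treat that as an assumption. BranchFields and encodeBranchOffsetSketch are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the fields of a wide branch immediate.
struct BranchFields {
  unsigned S, J1, J2, Imm10, Imm11;
};

// Splits a signed, halfword-aligned byte offset into its encoded fields,
// assuming offset = SignExtend(S:I1:I2:imm10:imm11:'0').
BranchFields encodeBranchOffsetSketch(std::int32_t Offset) {
  assert((Offset & 1) == 0 && "offset must be halfword aligned");
  std::uint32_t U = static_cast<std::uint32_t>(Offset);
  BranchFields F;
  F.S = (U >> 24) & 1;
  unsigned I1 = (U >> 23) & 1;
  unsigned I2 = (U >> 22) & 1;
  F.J1 = (I1 ^ F.S) ^ 1; // J1 = NOT(I1 XOR S)
  F.J2 = (I2 ^ F.S) ^ 1; // J2 = NOT(I2 XOR S)
  F.Imm10 = (U >> 12) & 0x3ff;
  F.Imm11 = (U >> 1) & 0x7ff;
  return F;
}
```

Under this assumption an offset of zero yields J1 = J2 = 1, and so does every small offset. Only offsets large enough to need J1 or J2 cleared would be corrupted by bits that were set in advance, which matches the observation that the problem only shows up for very large immediates.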
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit allows the ARM integrated assembler to parse
and assemble the code with .eabi_attribute, .cpu, and
.fpu directives.
To implement the feature, this commit moves the code from
AttrEmitter to ARMTargetStreamers, and several new test
cases related to cortex-m4, cortex-r5, and cortex-a15 are
added.
Besides, this commit also changes Subtarget->isFPOnlySP()
to Subtarget->hasD16() to match the usage of the .fpu directive.
This commit changes the test cases:
* Several .eabi_attribute directives in
2010-09-29-mc-asm-header-test.ll are removed because the .fpu
directive already covers the functionality.
* In the Cortex-A15 test case, the value for
Tag_Advanced_SIMD_arch has been changed from 1 to 2,
which is more precise.
llvm-svn: 193524
|
|
|
|
| |
llvm-svn: 193523
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
useAA significantly improves the handling of vector code that has TBAA
information attached. It also helps other cases, as shown by the testsuite
changes here. The only real downside I've seen is that it interferes with
MergeConsecutiveStores. The problem is that that optimization works top
down, starting at the first store in the chain, and looks for cases where
the chain result is only used by a single related store. These related
stores don't alias, so useAA will have rewritten all the later stores to
use a different chain input (typically the same one as the first store).
I think the advantages outweigh the disadvantages though, so for now I've
just disabled alias analysis for the unaligned-01.ll test.
llvm-svn: 193521
|
|
|
|
|
|
|
|
| |
Making useAA() default to true for SystemZ showed that the combiner alias
analysis wasn't handling volatile accesses. This hit many of the SystemZ
tests, but I arbitrarily picked one for the purpose of this patch.
llvm-svn: 193518
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Most SelectionDAG code drops the TBAA info when creating a new form of a
load and store (e.g. during legalization, or when converting a plain
load to an extending one). This patch tries to catch all cases where
the TBAA information can legitimately be carried over.
The patch adds alternative forms of getLoad() and getExtLoad() that take
a MachineMemOperand instead of individual fields. (The corresponding
getTruncStore() already exists.) The idea is to use the MachineMemOperand
forms when all fields are carried over (size, pointer info, isVolatile,
isNonTemporal, alignment and TBAA info). If some adjustment is being
made, e.g. to narrow the load, then we still pass the individual fields
but also pass the TBAA info.
llvm-svn: 193517
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
positive.
We can't do this in the general case, because claiming that a GEP has no
unsigned wrap is not valid when its index is negative:
  %gep = getelementptr inbounds i32* %p, i64 -1
But an inbounds GEP cannot run past the end of address space. So we check for
the very common case of a positive index and make GEPs derived from that NUW.
Together with Andy's recent non-unit stride work this lets us analyze loops
like
  void foo3(int *a, int *b) {
    for (; a < b; a++) {}
  }
PR12375, PR12376.
Differential Revision: http://llvm-reviews.chandlerc.com/D2033
llvm-svn: 193514
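The arithmetic behind the negative-index restriction can be checked with a short sketch. It models the address computation as 64-bit unsigned arithmetic; gepAddWrapsUnsigned is a hypothetical helper, not part of ScalarEvolution.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if Base + Index * ElemSize, evaluated as 64-bit unsigned
// arithmetic, carries out of the top bit (i.e. wraps in the unsigned sense).
bool gepAddWrapsUnsigned(std::uint64_t Base, std::int64_t Index,
                         std::uint64_t ElemSize) {
  assert(ElemSize != 0 && "element size must be non-zero");
  // A negative index becomes a huge unsigned offset (two's complement).
  std::uint64_t Offset = static_cast<std::uint64_t>(Index) * ElemSize;
  std::uint64_t Sum = Base + Offset;
  // An unsigned add wrapped exactly when the result is below an operand.
  return Sum < Base;
}
```

With an index of -1 and a 4-byte element the add wraps even though the resulting pointer is perfectly valid, so NUW cannot be claimed. With a positive index on an inbounds GEP the add stays inside the address space and does not wrap.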
|
|
|
|
| |
llvm-svn: 193512
|
|
|
|
| |
llvm-svn: 193511
|
|
|
|
| |
llvm-svn: 193510
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before this, I had only ported the shell of the pass. I've tried to keep everything
nearly identical to the ARM version. I think it will be very easy to eventually
merge these two and create a new more general pass that other targets can
use. I have some improvements I would like to make to allow pools to
be shared across functions and some other things. When I'm all done we
can think about making a more general pass. More to be ported but the
basic mechanism now works almost as well as gcc mips16.
llvm-svn: 193509
|
|
|
|
| |
llvm-svn: 193500
|
|
|
|
| |
llvm-svn: 193499
|
|
|
|
|
|
| |
Patch by Cameron McInally <cameron.mcinally@nyu.edu>
llvm-svn: 193497
|
|
|
|
|
|
| |
indirect memops.
llvm-svn: 193489
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements quick look-up of blocks in a loop by maintaining a hash set of the loop's blocks.
It improves the efficiency of loop analysis considerably; the improvement can reach 5-6% (458.sjeng).
Below are the compilation times for our benchmarks in llc before and after the patch.
Benchmark     llc - trunk           llc - patched
401.bzip2     0.339081   100.00%    0.329657   102.86%
403.gcc       19.853966  100.00%    19.605466  101.27%
429.mcf       0.049823   100.00%    0.048451   102.83%
433.milc      0.514898   100.00%    0.510217   100.92%
444.namd      1.109328   100.00%    1.103481   100.53%
445.gobmk     4.988028   100.00%    4.929114   101.20%
456.hmmer     0.843871   100.00%    0.825865   102.18%
458.sjeng     0.754238   100.00%    0.714095   105.62%
464.h264ref   2.9668     100.00%    2.90612    102.09%
471.omnetpp   4.556533   100.00%    4.511886   100.99%
bitmnp01      0.038168   100.00%    0.0357     106.91%
idctrn01      0.037745   100.00%    0.037332   101.11%
libquake2     3.78689    100.00%    3.76209    100.66%
libquake_     2.251525   100.00%    2.234104   100.78%
linpack       0.033159   100.00%    0.032788   101.13%
matrix01      0.045319   100.00%    0.043497   104.19%
nbench        0.333161   100.00%    0.329799   101.02%
tblook01      0.017863   100.00%    0.017666   101.12%
ttsprk01      0.054337   100.00%    0.053057   102.41%
Reviewer : Andrew Trick <atrick@apple.com>, Hal Finkel <hfinkel@anl.gov>
Approver : Andrew Trick <atrick@apple.com>
Test : Pass make check-all & llvm test-suite
llvm-svn: 193460
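A minimal sketch of the technique, assuming a loop that keeps its blocks in an ordered list and adds a hash set purely as a membership index. BlockSketch and LoopSketch are hypothetical, and std::unordered_set stands in for whatever set type the patch actually uses.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical basic block; only its identity (address) matters here.
struct BlockSketch {
  int Id;
};

class LoopSketch {
  std::vector<const BlockSketch *> Blocks;          // preserves insertion order
  std::unordered_set<const BlockSketch *> BlockSet; // membership index

public:
  // Records a block once; duplicates are ignored.
  void addBlock(const BlockSketch *BB) {
    assert(BB && "null block");
    if (BlockSet.insert(BB).second)
      Blocks.push_back(BB);
  }

  // Average constant-time lookup instead of a linear scan over Blocks.
  bool contains(const BlockSketch *BB) const { return BlockSet.count(BB) != 0; }

  const std::vector<const BlockSketch *> &blocks() const { return Blocks; }
};
```

The ordered vector is kept alongside the set so that iteration order stays deterministic; the set only answers the contains query.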
|
|
|
|
|
|
|
|
|
|
|
|
| |
Partial fix for PR17459: wrong code at -O3 on x86_64-linux-gnu
(affecting trunk and 3.3)
When SCEV expands a recurrence outside of a loop it attempts to scale
by the stride of the recurrence. Chained recurrences don't work that
way. We could compute binomial coefficients, but would have to
guarantee that the chained AddRec's are in a perfectly reduced form.
llvm-svn: 193438
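To illustrate why scaling by the stride is not enough for a chained recurrence, here is a sketch that evaluates a recurrence {a,+,b,+,c,...} at iteration N in two ways: by stepping it, and by the binomial form a + b*C(N,1) + c*C(N,2) + ... . The function names are hypothetical and this is not the SCEV expander code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Steps the recurrence N times: on each step every operand absorbs the next.
std::int64_t evalByIteration(std::vector<std::int64_t> Ops, unsigned N) {
  assert(!Ops.empty() && "recurrence needs at least a start value");
  for (unsigned It = 0; It < N; ++It)
    for (std::size_t I = 0; I + 1 < Ops.size(); ++I)
      Ops[I] += Ops[I + 1];
  return Ops[0];
}

// Closed form: sum over K of Ops[K] * C(N, K).
std::int64_t evalByBinomial(const std::vector<std::int64_t> &Ops, unsigned N) {
  std::int64_t Result = 0;
  std::int64_t Choose = 1; // C(N, 0)
  for (std::size_t K = 0; K < Ops.size(); ++K) {
    Result += Ops[K] * Choose;
    // C(N, K+1) = C(N, K) * (N - K) / (K + 1); the division is exact.
    Choose = Choose * (static_cast<std::int64_t>(N) - static_cast<std::int64_t>(K)) /
             static_cast<std::int64_t>(K + 1);
  }
  return Result;
}
```

For {0,+,1,+,1} at N = 3 both functions give 6, whereas start plus N times the first stride gives 0 + 3*1 = 3, which is the kind of mismatch the commit message describes.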
|
|
|
|
|
|
|
|
|
|
| |
Partial fix for PR17459: wrong code at -O3 on x86_64-linux-gnu
(affecting trunk and 3.3)
ScalarEvolutionNormalization was attempting to normalize by adding and
subtracting strides. Chained recurrences don't work that way.
llvm-svn: 193437
|
|
|
|
|
|
|
|
|
|
|
| |
This patch teaches GlobalStatus to analyze a call that uses the global value as
a callee, not as an argument.
With this change, internalize can handle the common use of linkonce_odr
functions. This reduces the number of linkonce_odr functions in an LTO build of
clang (checked with the emit-llvm gold plugin option) from 1730 to 60.
llvm-svn: 193436
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The loop vectorizer does not currently understand how to vectorize
extractelement instructions. The existing check, which excluded all
vector-valued instructions, did not catch extractelement instructions because
it checked only the return value. As a result, vectorization would proceed,
producing illegal instructions like this:
  %58 = extractelement <2 x i32> %15, i32 0
  %59 = extractelement i32 %58, i32 0
where the second extractelement is illegal because its first operand is not a vector.
llvm-svn: 193434
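The shape of the stricter check can be sketched as below. InstSketch and usesVectorType are hypothetical and much simpler than the vectorizer's legality code; the sketch only shows that the operands must be inspected as well as the result.

```cpp
#include <cassert>
#include <vector>

// Hypothetical instruction summary: whether the result and each operand
// have vector type.
struct InstSketch {
  bool ResultIsVector;
  std::vector<bool> OperandIsVector;
};

// Returns true if the instruction produces or consumes a vector value.
bool usesVectorType(const InstSketch &I) {
  if (I.ResultIsVector)
    return true;
  for (bool IsVec : I.OperandIsVector)
    if (IsVec)
      return true;
  return false;
}
```

An extractelement has a scalar result but a vector first operand, so a check on the result type alone lets it through while this one rejects it.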
|
|
|
|
| |
llvm-svn: 193432
|
|
|
|
| |
llvm-svn: 193429
|
|
|
|
| |
llvm-svn: 193427
|