|  | Commit message (Collapse) | Author | Age | Files | Lines | 
|---|
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
...  when the offset is not statically known.
Prioritize addresses relative to the stack pointer in the stackmap, but
fall back gracefully to other modes of addressing if the offset to the
stack pointer is not a known constant.
Patch by Oscar Blumberg!
Reviewers: sanjoy
Subscribers: llvm-commits, majnemer, rnk, sanjoy, thanm
Differential Revision: http://reviews.llvm.org/D21259
llvm-svn: 272756 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | Document the new parameter and threshold computation
model.  Also fix a bug when the threshold parameter
is set to be different from the default.
 
llvm-svn: 272749 | 
| | 
| 
| 
| | llvm-svn: 272740 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | Emitting symbol information requires us to have a definition for the
symbol.  A symbol reference is insufficient.
This fixes PR28123.
llvm-svn: 272738 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | profile.
Summary: With a runtime profile, we have more confidence in branch probabilities, so during basic block layout we set a lower hot-probability threshold so that blocks can be laid out optimally.
Reviewers: djasper, davidxl
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D20991
llvm-svn: 272729 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | If a local_unnamed_addr attribute is attached to a global, the address
is known to be insignificant within the module. It is distinct from the
existing unnamed_addr attribute in that it only describes a local property
of the module rather than a global property of the symbol.
This attribute is intended to be used by the code generator and LTO to allow
the linker to decide whether the global needs to be in the symbol table. It is
possible to exclude a global from the symbol table if three things are true:
- This attribute is present on every instance of the global (which means that
  the normal rule that the global must have a unique address can be broken without
  being observable by the program by performing comparisons against the global's
  address)
- The global has linkonce_odr linkage (which means that each linkage unit must have
  its own copy of the global if it requires one, and the copy in each linkage unit
  must be the same)
- It is a constant or a function (which means that the program cannot observe that
  the unique-address rule has been broken by writing to the global)
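As a rough illustration of the three conditions above, here is a minimal C++ sketch. The `GlobalSummary` struct and `canOmitFromSymbolTable` function are hypothetical names invented for this example; they are not part of LLVM or of any linker.

```cpp
// Hypothetical summary of one global, as a linker might see it after
// symbol resolution. None of these names come from LLVM itself.
struct GlobalSummary {
  bool localUnnamedAddrOnEveryInstance; // attribute present on every instance
  bool hasLinkOnceODRLinkage;           // linkonce_odr linkage
  bool isConstantOrFunction;            // program cannot write to it
};

// The global may be left out of the symbol table only if all three
// conditions listed in the commit message hold at once.
bool canOmitFromSymbolTable(const GlobalSummary &g) {
  return g.localUnnamedAddrOnEveryInstance && g.hasLinkOnceODRLinkage &&
         g.isConstantOrFunction;
}
```

Dropping any one condition (for example a writable global) makes the predicate false, which matches the "three things are true" wording.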
Although this attribute could in principle be computed from the module
contents, LTO clients (i.e. linkers) will normally need to be able to compute
this property as part of symbol resolution, and it would be inefficient to
materialize every module just to compute it.
See:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160509/356401.html
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160516/356738.html
for earlier discussion.
Part of the fix for PR27553.
Differential Revision: http://reviews.llvm.org/D20348
llvm-svn: 272709 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
Split NumInstrDups statistic into separate added/removed counts to avoid
negative stat being printed as unsigned.
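To show why a single net counter is a problem, here is a small C++ sketch. `DupCounters` and `netAsUnsigned` are hypothetical stand-ins for the statistics machinery, not the actual LLVM `STATISTIC` code.

```cpp
#include <cstdint>

// A single unsigned "net" counter wraps around as soon as removals
// exceed additions, so a small negative value prints as a huge number.
uint64_t netAsUnsigned(uint64_t added, uint64_t removed) {
  return added - removed; // well-defined modular arithmetic on unsigned
}

// Hypothetical split counters: each one only ever grows, so neither
// can go negative.
struct DupCounters {
  uint64_t added = 0;
  uint64_t removed = 0;
  void recordAdded(uint64_t n) { added += n; }
  void recordRemoved(uint64_t n) { removed += n; }
};
```

With 2 additions and 5 removals, the net counter holds 2^64 - 3, while the split counters simply report 2 and 5.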
Subscribers: mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D21335
llvm-svn: 272700 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | from i8 or i16
For <N x i32> type mul, pmuludq will be used for targets without SSE41, which
often introduces many extra pack and unpack instructions in the vectorized loop
body because pmuludq generates <N/2 x i64> type value. However when the operands
of <N x i32> mul are extended from smaller size values like i8 and i16, the type
of mul may be shrunk to use pmullw + pmulhw/pmulhuw instead of pmuludq, which
generates better code. For targets with SSE41, pmulld is supported so no
shrinking is needed.
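The idea can be modelled per lane in scalar C++. This is only a sketch of the arithmetic for the signed i16 case; the function name `mulViaHalves` is made up for illustration, and the real lowering operates on whole vectors and interleaves the two result vectors with unpack instructions.

```cpp
#include <cstdint>

// Scalar model of one lane: when both i32 operands were sign-extended
// from i16, the 32-bit product can be rebuilt from the two 16-bit
// halves that pmullw (low half) and pmulhw (high half) would produce.
int32_t mulViaHalves(int16_t a, int16_t b) {
  // Full signed product, held as raw bits.
  uint32_t bits = static_cast<uint32_t>(int32_t(a) * int32_t(b));
  uint16_t lo = static_cast<uint16_t>(bits);       // what pmullw yields
  uint16_t hi = static_cast<uint16_t>(bits >> 16); // what pmulhw yields
  // Interleaving hi:lo gives back the 32-bit result.
  return static_cast<int32_t>((uint32_t(hi) << 16) | lo);
}
```

For zero-extended operands the high half would come from pmulhuw instead; the recombination step is the same.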
Differential Revision: http://reviews.llvm.org/D20931
llvm-svn: 272694 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Change EmitGlobalVariable to check that the final assembler section is in BSS
before using .lcomm/.comm directive. This prevents globals from being
put into .bss erroneously when -data-sections is used.
This fixes PR26570.
Reviewers: echristo, rafael
Subscribers: llvm-commits, mehdi_amini
Differential Revision: http://reviews.llvm.org/D21146
llvm-svn: 272674 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | The exit-on-error flag in the ARM test is necessary in order to avoid an
unreachable in the DAGTypeLegalizer, when trying to expand a physical register.
We can also avoid this situation by introducing a bitcast early on, where the
invalid scalar-to-vector conversion is detected.
We also add a test for PowerPC, which goes through a similar code path in the
SelectionDAGBuilder.
Fixes PR27765.
Differential Revision: http://reviews.llvm.org/D21061
llvm-svn: 272644 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Save machine function pointer so that
the reference does not need to be passed around.
This also gives other methods access to machine
function for information such as entry count etc.
 
llvm-svn: 272594 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This is the third patch to clean up the code.
Included in this patch:
1. Further unclutter trace/chain formation main routine;
2. Isolate the logic to compute global cost/conflict detection
   into its own method;
3. Heavily document the selection algorithm;
4. Added helper hook to allow PGO specific logic to be
   added in the future.
 
llvm-svn: 272582 | 
| | 
| 
| 
| 
| 
| | constant in soft float mode on PowerPC 32 architecture.
llvm-svn: 272543 | 
| | 
| 
| 
| 
| 
| | No functionality change intended.
llvm-svn: 272516 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | This is the second patch to clean up the code.
In this patch, the logic to determine block outlining
is refactored and more comments are added.
 
llvm-svn: 272514 | 
| | 
| 
| 
| 
| 
| | Or replace with llvm::function_ref if it's never stored. NFC intended.
llvm-svn: 272513 | 
| | 
| 
| 
| 
| 
| 
| 
| | This used to be free, but copying and moving DebugLocs became expensive
after the metadata rewrite. Passing by reference eliminates a ton of
track/untrack operations. No functionality change intended.
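A minimal C++ sketch of the effect, using a hypothetical `TrackedLoc` type whose copy constructor counts copies in place of the real track/untrack bookkeeping. It is not the actual `DebugLoc` class.

```cpp
// Hypothetical stand-in for a handle whose copies are expensive.
struct TrackedLoc {
  static int copies; // number of copy constructions so far
  TrackedLoc() = default;
  TrackedLoc(const TrackedLoc &) { ++copies; }
};
int TrackedLoc::copies = 0;

// Passing by value copies the argument on every call.
void takeByValue(TrackedLoc) {}

// Passing by const reference performs no copy at all.
void takeByConstRef(const TrackedLoc &) {}
```

Calling `takeByValue` with an existing object bumps the counter once per call, while `takeByConstRef` leaves it unchanged.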
llvm-svn: 272512 | 
| | 
| 
| 
| | llvm-svn: 272509 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This is one of the patches to clean up the code so that
it is in a better form to make future enhancements easier.
In this patch, the logic to collect viable successors is
extracted as a helper to unclutter the caller, which has grown very
large recently. Also cleaned up BP adjustment code.
 
llvm-svn: 272482 | 
| | 
| 
| 
| | llvm-svn: 272458 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | undef uses are no real uses of a register and must be ignored by
findLastUseBefore() so that handleMove() does not produce invalid live
intervals in some cases.
This fixed http://llvm.org/PR28083
llvm-svn: 272446 | 
| | 
| 
| 
| 
| 
| | Now or instructions get translated into G_OR.
llvm-svn: 272433 | 
| | 
| 
| 
| 
| 
| 
| 
| | This method will be used for every binary operation.
NFC.
llvm-svn: 272431 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
When stack-protection is activated and WinEH exceptions are used,
the EHRegNode (exception handling registration) is allocated twice on the stack.
This was not breaking anything except losing space on the stack.
```
D:\src\llvm\examples>llc exc2.ll  -debug-only=pei
alloc FI(0) at SP[-24]
alloc FI(1) at SP[-48]   <<-- Allocated
alloc FI(1) at SP[-72]   <<-- Allocated twice!?
alloc FI(2) at SP[-76]
alloc FI(4) at SP[-80]
alloc FI(3) at SP[-84]
```
Reviewers: rnk, majnemer
Subscribers: chrisha, llvm-commits
Differential Revision: http://reviews.llvm.org/D21188
llvm-svn: 272426 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Adds a MachineFunctionPass that scans the body to find calls and
updates the register mask with the one saved by the
RegUsageInfoCollector analysis in PhysicalRegisterUsageInfo.
Patch by Vivek Pandya <vivekvpandya@gmail.com>
Differential Revision: http://reviews.llvm.org/D21180
llvm-svn: 272414 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Add an option to enable the analysis of MachineFunction register
usage to extract the list of clobbered registers.
When enabled, the CodeGen order is changed to be bottom up on the Call
Graph.
The analysis is split in two parts. RegUsageInfoCollector is the
MachineFunction pass that runs post-RA and collects the list of
clobbered registers to produce a register mask.
An immutable pass, RegisterUsageInfo, stores the RegMasks produced by
RegUsageInfoCollector and keeps them available. A future transformation
pass will use this information to update every call site after
instruction selection.
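As an illustration of what "produce a register mask" can look like, here is a hedged C++ sketch. It assumes the convention that a set bit means the register is preserved across the call; `buildPreservedMask` and `isPreserved` are hypothetical helpers, not the RegUsageInfoCollector code.

```cpp
#include <cstdint>
#include <vector>

// Build a mask with one bit per physical register, packed into 32-bit
// words. Assumption for this sketch: a set bit means "preserved", so we
// start with all bits set and clear the bit of every clobbered register.
std::vector<uint32_t> buildPreservedMask(unsigned numRegs,
                                         const std::vector<unsigned> &clobbered) {
  std::vector<uint32_t> mask((numRegs + 31) / 32, 0xFFFFFFFFu);
  for (unsigned reg : clobbered)
    if (reg < numRegs)
      mask[reg / 32] &= ~(1u << (reg % 32));
  return mask;
}

// Query helper: true if the register survives the call.
bool isPreserved(const std::vector<uint32_t> &mask, unsigned reg) {
  return ((mask[reg / 32] >> (reg % 32)) & 1u) != 0;
}
```

A call site that knows its callee's mask can then treat only the cleared bits as clobbered instead of assuming the full calling-convention clobber set.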
Patch by Vivek Pandya <vivekvpandya@gmail.com>
Differential Revision: http://reviews.llvm.org/D20769
llvm-svn: 272403 | 
| | 
| 
| 
| 
| 
| | No tests break with this enabled.
llvm-svn: 272340 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | When we delete a live-range, we check if that live-range is the origin of others
to keep it around for rematerialization. For that we check that the instruction
we are about to remove is the same as the definition of the VNI of the original
live-range.
If this is the case, we just shrink the live-range to an empty one.
Now, when we try to delete one of the children of such live-range (product of
splitting), we do the same check.
However, now the original live-range is empty and there is no way we can
access the VNI to check its definition, and we crash.
When we cannot get the VNI for the original live-range, that means we are not in
the presence of the original definition. Thus, this check does not need to happen
in that case and the crash is solved!
This bug was introduced in r266162 | wmi | 2016-04-12 20:08:27. It affects every
target that uses the greedy register allocator.
To happen, we need to delete both the original instruction and its split
products, in that order. This is likely to happen when rematerialization comes
into play.
Trying to produce a more robust test case. Will follow in a coming commit.
This fixes llvm.org/PR27983.
rdar://problem/26651519 
llvm-svn: 272314 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | Fixes {u,}long_{min,max,clamp} opencl piglit regressions on EG.
Reviewers: arsenm
Differential Revision: http://reviews.llvm.org/D17898
llvm-svn: 272272 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This reapplies commit r271930, r271915, r271923.  They hit a bug in
Thumb which is fixed in r272258 now.
The original message:
The code layout that TailMerging (inside BranchFolding) works on is not the
final layout optimized based on the branch probability. Generally, after
BlockPlacement, many new merging opportunities emerge.
This patch calls Tail Merging after MBP and calls MBP again if Tail Merging
merges anything.
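The control flow described here can be sketched in a few lines of C++. The function and parameter names are hypothetical; the real passes operate on a MachineFunction rather than on callbacks.

```cpp
// Hypothetical driver: run block placement, then tail merging, and
// repeat placement once if tail merging changed anything.
// Returns how many times placement ran.
template <typename PlaceFn, typename MergeFn>
unsigned runPlacementWithTailMerging(PlaceFn placeBlocks, MergeFn tailMerge) {
  unsigned placements = 0;
  placeBlocks();
  ++placements;
  if (tailMerge()) { // tailMerge returns true if it merged anything
    placeBlocks();
    ++placements;
  }
  return placements;
}
```

Placement therefore runs at most twice, and only once when tail merging finds nothing to merge.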
llvm-svn: 272267 | 
| | 
| 
| 
| 
| 
| 
| | Fixes a crash in the backend during an LTO build of rtld(1) in
FreeBSD.
llvm-svn: 272262 | 
| | 
| 
| 
| 
| 
| | They have probably been discarded during optimization.
llvm-svn: 272231 | 
| | 
| 
| 
| 
| 
| 
| 
| | Without that check it was possible to write test cases where the size
was not specified and we ended up with weird asserts down the road,
because the default value (1) would not make sense.
llvm-svn: 272226 | 
| | 
| 
| 
| 
| 
| | This improves the debuggability of the pass.
llvm-svn: 272210 | 
| | 
| 
| 
| 
| 
| 
| 
| | For complex rewritings, which do not occur currently, the related
machine instruction may have been deleted in the process. Therefore, do
not try to print it after the mapping is applied.
llvm-svn: 272209 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | computation of the end of range.
Refactor the code so that we do not compute in two different places the
end iterator for the range of new virtual registers for a given operand.
Although this refactoring was intended as NFC, this is not the case
because it actually fixes a bug where we were returning a range off by 1
(too long). Right now, this could not result in an actual bug because we
were accessing this range via the BreakDown size of the related operand.
llvm-svn: 272208 | 
| | 
| 
| 
| 
| 
| | Improve debuggability of the OperandsMapper helper class.
llvm-svn: 272207 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | backward-hot-prob consistently.
Summary:
Consider the following diamond CFG:
 A
/ \
B C
 \/
 D
Suppose A->B and A->C have probabilities 81% and 19%. In block-placement, A->B is called a hot edge and the final placement should be ABDC. However, the current implementation outputs ABCD. This is because when choosing the next block of B, it checks if Freq(C->D) > Freq(B->D) * 20%, which is true (if Freq(A) = 100, then Freq(B->D) = 81, Freq(C->D) = 19, and 19 > 81*20%=16.2). Actually, we should use 25% instead of 20% as the probability here, so that we have 19 < 81*25%=20.25, and the desired ABDC layout will be generated.
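The arithmetic in the example can be checked with a short C++ sketch. `competingEdgeBlocksPlacement` is a hypothetical helper written for this illustration, using integer percentages to avoid floating-point rounding; it is not the code in the block placement pass.

```cpp
#include <cstdint>

// Returns true when the competing edge (C->D in the example) is frequent
// enough, relative to the current edge (B->D), to stop D from being
// placed right after B. thresholdPercent is the backward hot probability
// expressed as a whole percentage.
bool competingEdgeBlocksPlacement(uint64_t freqCompeting, uint64_t freqCurrent,
                                  uint64_t thresholdPercent) {
  // freqCompeting > freqCurrent * threshold, scaled by 100.
  return freqCompeting * 100 > freqCurrent * thresholdPercent;
}
```

With Freq(C->D) = 19 and Freq(B->D) = 81, a 20% threshold gives 1900 > 1620, so D is not placed after B (layout ABCD), while a 25% threshold gives 1900 < 2025, so D follows B (layout ABDC).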
Reviewers: djasper, davidxl
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D20989
llvm-svn: 272203 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
Now DISubroutineType has a 'cc' field which should be a DW_CC_ enum.  If
it is present and non-zero, the backend will emit it as a
DW_AT_calling_convention attribute. On the CodeView side, we translate
it to the appropriate enum for the LF_PROCEDURE record.
I added a new LLVM vendor specific enum to the list of DWARF calling
conventions. DWARF does not appear to attempt to standardize these, so I
assume it's OK to do this until we coordinate with GCC on how to emit
vectorcall convention functions.
Reviewers: dexonsmith, majnemer, aaboud, amccarth
Subscribers: mehdi_amini, llvm-commits
Differential Revision: http://reviews.llvm.org/D21114
llvm-svn: 272197 | 
| | 
| 
| 
| 
| 
| 
| | Avoids unnecessary copies. All changes audited & pass tests with asan.
No functional change intended.
llvm-svn: 272190 | 
| | 
| 
| 
| 
| 
| | Differential Revision: http://reviews.llvm.org/D21107
llvm-svn: 272187 | 
| | 
| 
| 
| | llvm-svn: 272177 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | When repairing with a copy, instead of accounting for the cost of that
copy and actually inserting it, we may be able to use an alternative
source for the register to repair and just use it.
Make sure this is documented, so that we consider that opportunity at
some point.
llvm-svn: 272176 | 
| | 
| 
| 
| 
| 
| 
| | The RegBankSelect pass can now rely on the target to do the remapping of
the instructions.
llvm-svn: 272169 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | Now, the target will be able to provide its own implementation to remap
an instruction. This opens the way to crazier optimizations, but to
begin with, we will be able to handle something other than the
default mapping.
llvm-svn: 272165 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | Now that we have an entity that holds the remap information, the
rewriting should be easier to do.
No functional changes.
llvm-svn: 272164 | 
| | 
| 
| 
| 
| 
| 
| | The repairing code has no reason to change the source or destination of
the registers.
llvm-svn: 272163 | 
| | 
| 
| 
| 
| 
| 
| | This helper class is used to encapsulate the necessary information
to remap an instruction.
llvm-svn: 272161 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | When the command line option is set, it overrides anything that the
target may have set. The rationale is that we get what we asked for.
Options are respectively regbankselect-fast and regbankselect-greedy for
fast and greedy mode.
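The override rule amounts to "command line wins if present". A minimal C++17 sketch, with a hypothetical `resolveMode` function and enum that are not taken from the RegBankSelect sources:

```cpp
#include <optional>

// Hypothetical mode enum for this sketch.
enum class RegBankSelectMode { Fast, Greedy };

// If the command line specified a mode, use it; otherwise fall back to
// whatever the target chose.
RegBankSelectMode resolveMode(std::optional<RegBankSelectMode> commandLine,
                              RegBankSelectMode targetChoice) {
  return commandLine.value_or(targetChoice);
}
```

An empty optional models the case where neither regbankselect-fast nor regbankselect-greedy was passed.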
llvm-svn: 272158 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | repairing.
Copies are easy because we repair only when there is a mismatch. For
non-copy repairing, i.e., cases that involve breaking down or gathering
up the value, one of the operands may not have a register bank yet. Thus,
deriving a cost from that requires more work.
llvm-svn: 272157 |