| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This change is part of a series of commits dedicated to have a single
DataLayout during compilation by using always the one owned by the
module.
Reviewers: echristo
Subscribers: jholewinski, llvm-commits, rafael, yaron.keren
Differential Revision: http://reviews.llvm.org/D11040
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 241778
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This change is part of a series of commits dedicated to have a single
DataLayout during compilation by using always the one owned by the
module.
Reviewers: echristo
Subscribers: jholewinski, llvm-commits, rafael, yaron.keren
Differential Revision: http://reviews.llvm.org/D11037
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 241776
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This change is part of a series of commits dedicated to have a single
DataLayout during compilation by using always the one owned by the
module.
Reviewers: echristo
Subscribers: jholewinski, ted, yaron.keren, rafael, llvm-commits
Differential Revision: http://reviews.llvm.org/D11028
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 241775
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DataLayout is no longer optional. It was initialized with or without
a DataLayout, and the DataLayout when supplied could have been the
one from the TargetMachine.
Summary:
This change is part of a series of commits dedicated to have a single
DataLayout during compilation by using always the one owned by the
module.
Reviewers: echristo
Subscribers: jholewinski, llvm-commits, rafael, yaron.keren
Differential Revision: http://reviews.llvm.org/D11021
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 241774
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Avoid using the TargetMachine owned DataLayout and use the Module owned
one instead. This requires passing the DataLayout up the stack to
ComputeValueVTs().
This change is part of a series of commits dedicated to have a single
DataLayout during compilation by using always the one owned by the
module.
Reviewers: echristo
Subscribers: jholewinski, yaron.keren, rafael, llvm-commits
Differential Revision: http://reviews.llvm.org/D11019
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 241773
|
|
|
|
|
|
| |
Remove commented lines, trailing whitespace, etc.
llvm-svn: 241687
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This concludes the patch series to eliminate StringRef forms of GNU triples
from the internals of LLVM that began in r239036.
At this point, the StringRef-form of GNU Triples should only be used in the
public API (including IR serialization) and a couple objects that directly
interact with the API (most notably the Module class). The next step is to
replace these Triple objects with the TargetTuple object that will represent
our authoratative/unambiguous internal equivalent to GNU Triples.
Reviewers: rengolin
Subscribers: llvm-commits, jholewinski, ted, rengolin
Differential Revision: http://reviews.llvm.org/D10962
llvm-svn: 241472
|
|
|
|
|
|
|
|
| |
There is some functional change here because it changes target code from
atoi(3) to StringRef::getAsInteger which has error checking. For valid
constraints there should be no difference.
llvm-svn: 241411
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
According to PTX ISA:
For convenience, ld, st, and cvt instructions permit source and destination data operands to be wider than the instruction-type size, so that narrow values may be loaded, stored, and converted using regular-width registers. For example, 8-bit or 16-bit values may be held directly in 32-bit or 64-bit registers when being loaded, stored, or converted to other types and sizes. The operand type checking rules are relaxed for bit-size and integer (signed and unsigned) instruction types; floating-point instruction types still require that the operand type-size matches exactly, unless the operand is of bit-size type.
So, the ISA does not support load with extending/store with truncatation for floating numbers. This is reflected in setting the loadext/truncstore actions to expand in the code for floating numbers, but vectors of floating numbers are not taken care of.
As a result, loading a vector of floats followed by a fp_extend may be combined by DAGCombiner to a extload, and the extload may be lowered to NVPTXISD::LoadV2 with extending information. However, NVPTXISD::LoadV2 does not perform extending, and no extending instructions are inserted. Finally, PTX instructions with mismatched types are generated, like
ld.v2.f32 {%fd3, %fd4}, [%rd2]
This patch adds the correct actions for vectors of floats, so DAGCombiner would not create loads with extending, and correct code is generated.
Patched by Gang Hu.
Test Plan: Test case attached.
Reviewers: jingyue
Reviewed By: jingyue
Subscribers: llvm-commits, jholewinski
Differential Revision: http://reviews.llvm.org/D10876
llvm-svn: 241191
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Offset of frame index is calculated by NVPTXPrologEpilogPass. Before
that the correct offset of stack objects cannot be obtained, which
leads to wrong offset if there are more than 2 frame objects. This patch
move NVPTXPeephole after NVPTXPrologEpilogPass. Because the frame index
is already replaced by %VRFrame in NVPTXPrologEpilogPass, we check
VRFrame register instead, and try to remove the VRFrame if there
is no usage after NVPTXPeephole pass.
Patched by Xuetian Weng.
Test Plan:
Strengthened test/CodeGen/NVPTX/local-stack-frame.ll to check the
offset calculation based on SP and SPL.
Reviewers: jholewinski, jingyue
Reviewed By: jingyue
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D10853
llvm-svn: 241185
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: NFC
Test Plan: no regression
Reviewers: wengxt
Reviewed By: wengxt
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D10849
llvm-svn: 241118
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Really check if %SP is not used in other places, instead of checking only exact
one non-dbg use.
Patched by Xuetian Weng.
Test Plan:
@foo4 in test/CodeGen/NVPTX/local-stack-frame.ll, create a case that
SP will appear twice.
Reviewers: jholewinski, jingyue
Reviewed By: jingyue
Subscribers: llvm-commits, sfantao, jholewinski
Differential Revision: http://reviews.llvm.org/D10844
llvm-svn: 241099
|
|
|
|
|
|
| |
backend.
llvm-svn: 241081
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Some front ends make kernel pointers global already. In that case,
handlePointerParams does nothing.
Test Plan: more tests in lower-kernel-ptr-arg.ll
Reviewers: grosser
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D10779
llvm-svn: 240849
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch first change the register that holds local address for stack
frame to %SPL. Then the new NVPTXPeephole pass will try to scan the
following pattern
%vreg0<def> = LEA_ADDRi64 <fi#0>, 4
%vreg1<def> = cvta_to_local %vreg0
and transform it into
%vreg1<def> = LEA_ADDRi64 %VRFrameLocal, 4
Patched by Xuetian Weng
Test Plan: test/CodeGen/NVPTX/local-stack-frame.ll
Reviewers: jholewinski, jingyue
Reviewed By: jingyue
Subscribers: eliben, jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D10549
llvm-svn: 240587
|
|
|
|
|
|
|
| |
We only need to pass in a DataLayout when mangling a raw string, not when
constructing the mangler.
llvm-svn: 240405
|
|
|
|
|
|
| |
Apparently, the style needs to be agreed upon first.
llvm-svn: 240390
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The patch is generated using this command:
tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
-checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
llvm/lib/
Thanks to Eugene Kosov for the original patch!
llvm-svn: 240137
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is done by first adding two additional instructions to convert the
alloca returned address to local and convert it back to generic. Then
replace all uses of alloca instruction with the converted generic
address. Then we can rely NVPTXFavorNonGenericAddrSpace pass to combine
the generic addresscast and the corresponding Load, Store, Bitcast, GEP
Instruction together.
Patched by Xuetian Weng (xweng@google.com).
Test Plan: test/CodeGen/NVPTX/lower-alloca.ll
Reviewers: jholewinski, jingyue
Reviewed By: jingyue
Subscribers: meheff, broune, eliben, jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D10483
llvm-svn: 239964
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Reviewers: rengolin
Reviewed By: rengolin
Subscribers: llvm-commits, rengolin, jholewinski
Differential Revision: http://reviews.llvm.org/D10382
llvm-svn: 239823
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
For the moment, TargetMachine::getTargetTriple() still returns a StringRef.
This continues the patch series to eliminate StringRef forms of GNU triples
from the internals of LLVM that began in r239036.
Reviewers: rengolin
Reviewed By: rengolin
Subscribers: ted, llvm-commits, rengolin, jholewinski
Differential Revision: http://reviews.llvm.org/D10362
llvm-svn: 239554
|
|
|
|
| |
llvm-svn: 239553
|
|
|
|
|
|
|
|
|
|
| |
It hasn't been used since r130964.
This also removes MachineModuleInfo::isUsedFunction and
MachineModuleInfo::AnalyzeModule, both of which were only
there to support UsedFunctions.
llvm-svn: 239501
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
create*MCSubtargetInfo(). NFC.
Summary:
This continues the patch series to eliminate StringRef forms of GNU triples
from the internals of LLVM that began in r239036.
Reviewers: rafael
Reviewed By: rafael
Subscribers: rafael, ted, jfb, llvm-commits, rengolin, jholewinski
Differential Revision: http://reviews.llvm.org/D10311
llvm-svn: 239467
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We used to assume V->RAUW only modifies the operand list of V's user.
However, if V and V's user are Constants, RAUW may replace and invalidate V's
user entirely.
This patch fixes the above issue by letting the caller replace the
operand instead of calling RAUW on Constants.
Test Plan: @nested_const_expr and @rauw in access-non-generic.ll
Reviewers: broune, jholewinski
Reviewed By: broune, jholewinski
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D10345
llvm-svn: 239435
|
|
|
|
|
|
|
|
|
|
| |
array of bytes. The generation of this byte arrays was expecting
the host to be little endian, which prevents big endian hosts to be
used in the generation of the PTX code. This patch fixes the
problem by changing the way the bytes are extracted so that it
works for either little and big endian.
llvm-svn: 239412
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
NVPTX doesn't seem to support any additional constraints. Therefore remove
the target hook.
No functional change intended.
Reviewers: jholewinski
Reviewed By: jholewinski
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D8209
llvm-svn: 239395
|
|
|
|
| |
llvm-svn: 239370
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This cleans up most allocas NVPTXLowerKernelArgs emits for byval
parameters.
Test Plan: makes bug21465.ll more stronger to verify no redundant local load/store.
Reviewers: eliben, jholewinski
Reviewed By: eliben, jholewinski
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D10322
llvm-svn: 239368
|
|
|
|
|
|
|
|
|
| |
NVPTXISelDAGToDAG translates "addrspacecast to param" to
NVPTX::nvvm_ptr_gen_to_param
Added an llc test in bug21465.
llvm-svn: 239100
|
|
|
|
|
|
| |
llc crashed for NVPTX backend
llvm-svn: 239094
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
With this patch, NVPTXLowerKernelArgs converts a kernel pointer argument to a
pointer in the global address space. This change, along with
NVPTXFavorNonGenericAddrSpaces, allows the NVPTX backend to emit ld.global.*
and st.global.* for accessing kernel pointer arguments.
Minor changes:
1. refactor: extract function convertToPointerInAddrSpace
2. fix a bug in the test case in bug21465.ll
Test Plan: lower-kernel-ptr-arg.ll
Reviewers: eliben, meheff, jholewinski
Reviewed By: jholewinski
Subscribers: wengxt, jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D10154
llvm-svn: 239082
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
create*AsmInfo(). NFC.
Summary:
This is the first of several patches to eliminate StringRef forms of GNU
triples from the internals of LLVM. After this is complete, GNU triples
will be replaced by a more authoratitive representation in the form of
an LLVM TargetTuple.
Reviewers: rengolin
Reviewed By: rengolin
Subscribers: ted, llvm-commits, rengolin, jholewinski
Differential Revision: http://reviews.llvm.org/D10236
llvm-svn: 239036
|
|
|
|
|
|
| |
NFC.
llvm-svn: 238843
|
|
|
|
|
|
|
|
|
|
| |
This is important because of different addressing modes
depending on the address space for GPU targets.
This only adds the argument, and does not update
any of the uses to provide the correct address space.
llvm-svn: 238723
|
|
|
|
| |
llvm-svn: 238634
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch allows NVPTXFavorNonGenericAddrSpaces to remove addrspacecast
from longer chains consisting of GEPs and BitCasts. For example, it can
now optimize
%0 = addrspacecast [10 x float] addrspace(3)* @a to [10 x float]*
%1 = gep [10 x float]* %0, i64 0, i64 %i
%2 = bitcast float* %1 to i32*
%3 = load i32* %2 ; emits ld.u32
to
%0 = gep [10 x float] addrspace(3)* @a, i64 0, i64 %i
%1 = bitcast float addrspace(3)* %0 to i32 addrspace(3)*
%3 = load i32 addrspace(3)* %1 ; emits ld.shared.f32
Test Plan: @ld_int_from_global_float in access-non-generic.ll
Reviewers: broune, eliben, jholewinski, meheff
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D10074
llvm-svn: 238574
|
|
|
|
|
|
| |
Creating temporary std::strings there is unnecessary.
llvm-svn: 238412
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch made two improvements to NaryReassociate and the NVPTX pipeline
1. Run EarlyCSE/GVN after NaryReassociate to get rid of redundant common
expressions.
2. When adding an instruction to SeenExprs, maps both the SCEV before and after
reassociation to that instruction.
Test Plan: updated @reassociate_gep_nsw in nary-gep.ll
Reviewers: meheff, broune
Reviewed By: broune
Subscribers: dberlin, jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D9947
llvm-svn: 238396
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This starts merging MCSection and MCSectionData.
There are a few issues with the current split between MCSection and
MCSectionData.
* It optimizes the the not as important case. We want the production
of .o files to be really fast, but the split puts the information used
for .o emission in a separate data structure.
* The ELF/COFF/MachO hierarchy is not represented in MCSectionData,
leading to some ad-hoc ways to represent the various flags.
* It makes it harder to remember where each item is.
The attached patch starts merging the two by moving the alignment from
MCSectionData to MCSection.
Most of the patch is actually just dropping 'const', since
MCSectionData is mutable, but MCSection was not.
llvm-svn: 237936
|
|
|
|
|
|
|
| |
The naming was a mish-mash of old and new style. Update to be consistent
with the new. NFC.
llvm-svn: 237594
|
|
|
|
| |
llvm-svn: 237483
|
|
|
|
| |
llvm-svn: 237481
|
|
|
|
|
|
| |
s/InitMCCodeGenInfo/initMCCodeGenInfo/
llvm-svn: 237471
|
|
|
|
|
|
| |
MCOperand::Create*() methods renamed to MCOperand::create*().
llvm-svn: 237275
|
|
|
|
|
|
| |
fix them
llvm-svn: 236775
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch introduces a new pass that computes the safe point to insert the
prologue and epilogue of the function.
The interest is to find safe points that are cheaper than the entry and exits
blocks.
As an example and to avoid regressions to be introduce, this patch also
implements the required bits to enable the shrink-wrapping pass for AArch64.
** Context **
Currently we insert the prologue and epilogue of the method/function in the
entry and exits blocks. Although this is correct, we can do a better job when
those are not immediately required and insert them at less frequently executed
places.
The job of the shrink-wrapping pass is to identify such places.
** Motivating example **
Let us consider the following function that perform a call only in one branch of
a if:
define i32 @f(i32 %a, i32 %b) {
%tmp = alloca i32, align 4
%tmp2 = icmp slt i32 %a, %b
br i1 %tmp2, label %true, label %false
true:
store i32 %a, i32* %tmp, align 4
%tmp4 = call i32 @doSomething(i32 0, i32* %tmp)
br label %false
false:
%tmp.0 = phi i32 [ %tmp4, %true ], [ %a, %0 ]
ret i32 %tmp.0
}
On AArch64 this code generates (removing the cfi directives to ease
readabilities):
_f: ; @f
; BB#0:
stp x29, x30, [sp, #-16]!
mov x29, sp
sub sp, sp, #16 ; =16
cmp w0, w1
b.ge LBB0_2
; BB#1: ; %true
stur w0, [x29, #-4]
sub x1, x29, #4 ; =4
mov w0, wzr
bl _doSomething
LBB0_2: ; %false
mov sp, x29
ldp x29, x30, [sp], #16
ret
With shrink-wrapping we could generate:
_f: ; @f
; BB#0:
cmp w0, w1
b.ge LBB0_2
; BB#1: ; %true
stp x29, x30, [sp, #-16]!
mov x29, sp
sub sp, sp, #16 ; =16
stur w0, [x29, #-4]
sub x1, x29, #4 ; =4
mov w0, wzr
bl _doSomething
add sp, x29, #16 ; =16
ldp x29, x30, [sp], #16
LBB0_2: ; %false
ret
Therefore, we would pay the overhead of setting up/destroying the frame only if
we actually do the call.
** Proposed Solution **
This patch introduces a new machine pass that perform the shrink-wrapping
analysis (See the comments at the beginning of ShrinkWrap.cpp for more details).
It then stores the safe save and restore point into the MachineFrameInfo
attached to the MachineFunction.
This information is then used by the PrologEpilogInserter (PEI) to place the
related code at the right place. This pass runs right before the PEI.
Unlike the original paper of Chow from PLDI’88, this implementation of
shrink-wrapping does not use expensive data-flow analysis and does not need hack
to properly avoid frequently executed point. Instead, it relies on dominance and
loop properties.
The pass is off by default and each target can opt-in by setting the
EnableShrinkWrap boolean to true in their derived class of TargetPassConfig.
This setting can also be overwritten on the command line by using
-enable-shrink-wrap.
Before you try out the pass for your target, make sure you properly fix your
emitProlog/emitEpilog/adjustForXXX method to cope with basic blocks that are not
necessarily the entry block.
** Design Decisions **
1. ShrinkWrap is its own pass right now. It could frankly be merged into PEI but
for debugging and clarity I thought it was best to have its own file.
2. Right now, we only support one save point and one restore point. At some
point we can expand this to several save point and restore point, the impacted
component would then be:
- The pass itself: New algorithm needed.
- MachineFrameInfo: Hold a list or set of Save/Restore point instead of one
pointer.
- PEI: Should loop over the save point and restore point.
Anyhow, at least for this first iteration, I do not believe this is interesting
to support the complex cases. We should revisit that when we motivating
examples.
Differential Revision: http://reviews.llvm.org/D9210
<rdar://problem/3201744>
llvm-svn: 236507
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Finish off PR23080 by renaming the debug info IR constructs from `MD*`
to `DI*`. The last of the `DIDescriptor` classes were deleted in
r235356, and the last of the related typedefs removed in r235413, so
this has all baked for about a week.
Note: If you have out-of-tree code (like a frontend), I recommend that
you get everything compiling and tests passing with the *previous*
commit before updating to this one. It'll be easier to keep track of
what code is using the `DIDescriptor` hierarchy and what you've already
updated, and I think you're extremely unlikely to insert bugs. YMMV of
course.
Back to *this* commit: I did this using the rename-md-di-nodes.sh
upgrade script I've attached to PR23080 (both code and testcases) and
filtered through clang-format-diff.py. I edited the tests for
test/Assembler/invalid-generic-debug-node-*.ll by hand since the columns
were off-by-three. It should work on your out-of-tree testcases (and
code, if you've followed the advice in the previous paragraph).
Some of the tests are in badly named files now (e.g.,
test/Assembler/invalid-mdcompositetype-missing-tag.ll should be
'dicompositetype'); I'll come back and move the files in a follow-up
commit.
llvm-svn: 236120
|
|
|
|
|
|
| |
NVPTX overrides.
llvm-svn: 236007
|
|
|
|
|
|
|
|
|
|
|
| |
We need to track if an AddrSpaceCast expression was seen when
generating an MCExpr for a ConstantExpr. This change introduces a
custom lowerConstant method to the NVPTX asm printer that will create
NVPTXGenericMCSymbolRefExpr nodes at the appropriate places to encode
the information that a given symbol needs to be casted to a generic
address.
llvm-svn: 236000
|