summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AMDGPU/AMDGPURewriteOutArguments.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Rename DEBUG macro to LLVM_DEBUG.Nicola Zaghen2018-05-141-5/+6
| | | | | | | | | | | | | | | | The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually chage DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240
* [AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use ↵Eugene Zelenko2017-08-081-13/+32
| | | | | | warnings; other minor fixes (NFC). llvm-svn: 310429
* [AMDGPU] Put a function used only inside assert() under NDEBUG.Davide Italiano2017-08-011-0/+4
| | | | llvm-svn: 309723
* AMDGPU: Look through a bitcast user of an out argumentMatt Arsenault2017-07-281-16/+101
| | | | | | | | | | | | | | This allows handling of a lot more of the interesting cases in Blender. Most of the large functions unlikely to be inlined have this pattern. This is a special case for what clang emits for OpenCL 3 element vectors. Annoyingly, these are emitted as <3 x elt>* pointers, but accessed as <4 x elt>* operations. This also needs to handle cases where a struct containing a single vector is used. llvm-svn: 309419
* AMDGPU: Add pass to replace out argumentsMatt Arsenault2017-07-281-0/+375
It is better to return arguments directly in registers if we are making a call rather than introducing expensive stack usage. In one of sample compile from one of Blender's many kernel variants, this fires on about ~20 different functions. Future improvements may be to recognize simple cases where the pointer is indexing a small array. This also fails when the store to the out argument is in a separate block from the return, which happens in a few of the Blender functions. This should also probably be using MemorySSA which might help with that. I'm not sure this is correct as a FunctionPass, but MemoryDependenceAnalysis seems to not work with a ModulePass. I'm also not sure where it should run.I think it should run before DeadArgumentElimination, so maybe either EP_CGSCCOptimizerLate or EP_ScalarOptimizerLate. llvm-svn: 309416
OpenPOWER on IntegriCloud