summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms/Inline/AMDGPU
Commit message (Collapse)AuthorAgeFilesLines
* [AMDGPU] Tune inlining parameters for AMDGPU targetDaniil Fukalov2019-07-171-0/+31
| | | | | | | | | | | | | | | | | | | Summary: Since the target has no significant advantage of vectorization, vector instructions bous threshold bonus should be optional. amdgpu-inline-arg-alloca-cost parameter default value and the target InliningThresholdMultiplier value tuned then respectively. Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, eraman, hiraditya, haicheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64642 llvm-svn: 366348
* [AMDGPU] Use InliningThresholdMultiplier for inline hintStanislav Mekhanoshin2019-05-311-0/+77
| | | | | | | | | | | | | | AMDGPU uses multiplier 9 for the inline cost. It is taken into account everywhere except for inline hint threshold. As a result we are penalizing functions with the inline hint making them less probable to be inlined than those without the hint. Defaults are 225 for a normal function and 325 for a function with an inline hint. Currently we have effective threshold 225 * 9 = 2025 for normal functions and just 325 for those with the hint. That is fixed by this patch. Differential Revision: https://reviews.llvm.org/D62707 llvm-svn: 362239
* AMDGPU: Boost inline threshold with addrspacecasted alloca argumentsMatt Arsenault2019-05-241-0/+70
| | | | | | | This was skipping GetUnderlyingObject for nonprivate addresses, but an alloca could also be found through an addrspacecast if it's flat. llvm-svn: 361649
* AMDGPU: Assume xnack is enabled by defaultMatt Arsenault2019-05-161-0/+67
| | | | | | | | | | | This is the conservatively correct default. It is always safe to assume xnack is enabled, but not the converse. Introduce a feature to blacklist targets where xnack can never be meaningfully enabled. I'm not sure the targets this is applied to is 100% correct. llvm-svn: 360903
* Revert "Temporarily Revert "Add basic loop fusion pass.""Eric Christopher2019-04-175-0/+372
| | | | | | | | The reversion apparently deleted the test/Transforms directory. Will be re-reverting again. llvm-svn: 358552
* Temporarily Revert "Add basic loop fusion pass."Eric Christopher2019-04-175-372/+0
| | | | | | | | As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358546
* AMDGPU: Assume ECC is enabled by default if supportedMatt Arsenault2019-04-031-0/+70
| | | | | | | | | | The test should really be checking for the property directly in the code object headers, but there are problems with this. I don't see this directly represented in the text form, and for the binary emission this is depending on a function level subtarget feature to emit a global flag. llvm-svn: 357558
* AMDGPU: Fix test filenameMatt Arsenault2019-04-021-0/+0
| | | | llvm-svn: 357441
* AMDGPU: Remove dx10-clamp from subtarget featuresMatt Arsenault2019-03-292-0/+197
| | | | | | | | | | | | | | | | | | Since this can be set with s_setreg*, it should not be a subtarget property. Set a default based on the calling convention, and Introduce a new amdgpu-dx10-clamp attribute to override this if desired. Also introduce a new amdgpu-ieee attribute to match. The values need to match to allow inlining. I think it is OK for the caller's dx10-clamp attribute to override the callee, but there doesn't appear to be the infrastructure to do this currently without definining the attribute in the generic Attributes.td. Eventually the calling convention lowering will need to insert a mode switch somewhere for these. llvm-svn: 357302
* AMDGPU: Ignore CodeObjectV3 when inliningMatt Arsenault2019-02-121-0/+13
| | | | | | | | | | This was inhibiting inlining of library functions when clang was invoking the inliner directly. This is covering a bit of a mess with subtarget feature handling, and this shouldn't be a subtarget feature. The behavior is different depending on whether you are using a -mattr flag in clang, or llc, opt. llvm-svn: 353899
* AMDGPU: Use a custom areInlineCompatibleMatt Arsenault2017-08-072-0/+92
Fixes not inlining OpenCL library functions on AMDGPU, which don't have an explicitly set target-cpu. llvm-svn: 310269
OpenPOWER on IntegriCloud