summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/AMDGPU/ftrunc.f64.ll
Commit message (Collapse)AuthorAgeFilesLines
* [AMDGPU] Add and update scalar instructionsGraham Sellers2018-11-291-2/+1
| | | | | | | | | This patch adds support for S_ANDN2, S_ORN2 32-bit and 64-bit instructions and adds splits to move them to the vector unit (for which there is no equivalent instruction). It modifies the way that the more complex scalar instructions are lowered to vector instructions by first breaking them down to sequences of simpler scalar instructions which are then lowered through the existing code paths. The pattern for S_XNOR has also been updated to apply inversion to one input rather than the output of the XOR as the result is equivalent and may allow leaving the NOT instruction on the scalar unit. A new tests for NAND, NOR, ANDN2 and ORN2 have been added, and existing tests now hit the new instructions (and have been modified accordingly). Differential: https://reviews.llvm.org/D54714 llvm-svn: 347877
* AMDGPU: Remove -mcpu=SIMatt Arsenault2017-08-071-3/+3
| | | | | | Leftover from before amdgcn/r600 split. llvm-svn: 310277
* [AMDGPU] Switch scalarize global loads ON by defaultAlexander Timofeev2017-07-041-3/+3
| | | | | | Differential revision: https://reviews.llvm.org/D34407 llvm-svn: 307097
* Revert r307026, "[AMDGPU] Switch scalarize global loads ON by default"NAKAMURA Takumi2017-07-041-3/+3
| | | | | | | | | It broke a testcase. Failing Tests (1): LLVM :: CodeGen/AMDGPU/alignbit-pat.ll llvm-svn: 307054
* [AMDGPU] Switch scalarize global loads ON by defaultAlexander Timofeev2017-07-031-3/+3
| | | | | | Differential revision: https://reviews.llvm.org/D34407 llvm-svn: 307026
* AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernelMatt Arsenault2017-03-211-7/+7
| | | | | | | | | | | | Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default calling convention can be changed to a non-kernel. Converted with perl -pi -e 's/define void/define amdgpu_kernel void/' on the relevant test directories (and undoing in one place that actually wanted a non-kernel). llvm-svn: 298444
* AMDGPU: Define a schedule class for COPY.Matthias Braun2016-06-241-2/+2
| | | | | | | | | | COPY was lacking a scheduling class, define it to avoid regressions in the upcoming change to the bidirectional MachineScheduler. Approved by tstellar on IRC. Differential Revision: http://reviews.llvm.org/D21540 llvm-svn: 273751
* AMDGPU: Use s_addk_i32 / s_mulk_i32Matt Arsenault2016-04-161-2/+2
| | | | llvm-svn: 266506
* AMDGPU/SI: Improve MachineSchedModel definitionTom Stellard2016-03-301-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | This patch contains a few improvements to the model, including: - Using a single resource with a defined buffers size for each memory unit. - Setting the IssueWidth correctly. - Fixing latency values for memory instructions. shader-db stats: 16429 shaders in 3231 tests Totals: SGPRS: 318232 -> 312328 (-1.86 %) VGPRS: 208996 -> 209346 (0.17 %) Code Size: 7147044 -> 7166440 (0.27 %) bytes LDS: 83 -> 83 (0.00 %) blocks Scratch: 1862656 -> 1459200 (-21.66 %) bytes per wave Max Waves: 49182 -> 49243 (0.12 %) Wait states: 0 -> 0 (0.00 %)A Differential Revision: http://reviews.llvm.org/D18453 llvm-svn: 264877
* Revert "Remove unnecessary call to getAllocatableRegClass"Tom Stellard2015-11-121-2/+2
| | | | | | | | | | | | | This reverts commit r252565. This also includes the revert of the commit mentioned below in order to avoid breaking tests in AMDGPU: Revert "AMDGPU: Set isAllocatable = 0 on VS_32/VS_64" This reverts commit r252674. llvm-svn: 252956
* AMDGPU: Set isAllocatable = 0 on VS_32/VS_64Matt Arsenault2015-11-111-2/+2
| | | | llvm-svn: 252674
* DAGCombiner: Combine extract_vector_elt from build_vectorMatt Arsenault2015-10-121-6/+6
| | | | | | | | | | | | | | This basic combine was surprisingly missing. AMDGPU legalizes many operations in terms of 32-bit vector components, so not doing this results in many extra copies and subregister extracts that need to be cleaned up later. InstCombine already does this for the hasOneUse case. The target hook is to fix a handful of tests which break (e.g. ARM/vmov.ll) which turn from a vector materialize repeated immediate instruction to a constant vector load with more scalar copies from it. llvm-svn: 250129
* R600 -> AMDGPU renameTom Stellard2015-06-131-0/+111
llvm-svn: 239657
OpenPOWER on IntegriCloud