path: root/llvm/lib/Target/R600
Commit message | Author | Age | Files | Lines
...
* R600/SI: Use a custom encoding method for simm16 in SOPP branch instructions | Tom Stellard | 2014-07-21 | 6 | -14/+89
  This allows us to explicitly define the type of fixup that is needed, so we can distinguish this from future fixup types.
  llvm-svn: 213527
* R600/SI: Rename SOPP operands to match the encoding fields | Tom Stellard | 2014-07-21 | 2 | -19/+19
  llvm-svn: 213526
* SIISelLowering.cpp: Define _USE_MATH_DEFINES so that M_PI is provided by MS <cmath>. | NAKAMURA Takumi | 2014-07-20 | 1 | -0/+6
  FIXME: Would it be better to move it into configure?
  llvm-svn: 213477
* R600: Remove unused function | Matt Arsenault | 2014-07-20 | 4 | -11/+1
  llvm-svn: 213472
* R600/SI: Remove dead code and add missing tests. | Matt Arsenault | 2014-07-20 | 1 | -14/+0
  This probably was killed by some generic DAGCombiner improvements in checking the TargetBooleanContents instead of just 1.
  llvm-svn: 213471
* Revert accidentally committed r213459 | Matt Arsenault | 2014-07-19 | 1 | -3/+1
  llvm-svn: 213461
* XXX - Increase unroll threshold | Matt Arsenault | 2014-07-19 | 1 | -1/+3
  llvm-svn: 213459
* R600/SI: implement range reduction for sin/cos | Matt Arsenault | 2014-07-19 | 4 | -12/+33
  These instructions can only take a limited input range, and return the constant value 1 out of range. We should do range reduction to be able to process arbitrary values. Use a FRACT instruction after normalization to achieve this.
  Also add a test for constant folding with the lowered code with unsafe-fp-math enabled.
  v2: use DAG lowering instead of intrinsic, adapt test
  v3: calculate constant, fold pattern into instruction definition
  v4: misc style fixes, add sin-fold testcase, cosmetics
  Patch by Grigori Goronzy
  llvm-svn: 213458
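The range reduction described in this commit can be sketched roughly as follows. This is only an illustration in Python, not the actual lowering: `hw_sin_turns` is a hypothetical stand-in for a hardware sine that expects its input in turns (radians divided by 2*pi), and `INV_TWO_PI`, `fract` and `sin_range_reduced` are names invented for this example.

```python
import math

# Normalization constant: converts radians to turns (assumed for illustration).
INV_TWO_PI = 1.0 / (2.0 * math.pi)

def fract(x):
    """Fractional part in the FRACT sense: x - floor(x)."""
    return x - math.floor(x)

def hw_sin_turns(t):
    """Hypothetical stand-in for a hardware sine taking input in turns."""
    return math.sin(2.0 * math.pi * t)

def sin_range_reduced(x):
    """Normalize radians to turns, then keep only the fractional part."""
    return hw_sin_turns(fract(x * INV_TWO_PI))
```

Because sine is periodic, dropping the integer number of turns leaves the result unchanged up to rounding while keeping the argument passed to the instruction within one period.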
* R600: Implement a few simple TTI queries. | Matt Arsenault | 2014-07-19 | 1 | -0/+24
  I'm not sure if these have any effect right now.
  llvm-svn: 213455
* R600: support fpext/fptrunc operations to and from f16. | Tim Northover | 2014-07-18 | 1 | -0/+4
  llvm-svn: 213376
* R600: support f16 -> f64 conversion intrinsic. | Tim Northover | 2014-07-18 | 1 | -0/+2
  Unfortunately, we don't seem to have a direct truncation, but the extension can be legally split into two operations so we should support that.
  llvm-svn: 213357
* R600: Implement TTI:getPopcntSupport | Matt Arsenault | 2014-07-18 | 2 | -2/+12
  The test is just copied from X86, and I don't know of a better way to test it.
  llvm-svn: 213351
* Fix typos | Matt Arsenault | 2014-07-17 | 1 | -3/+3
  llvm-svn: 213285
* CodeGen: extend f16 conversions to permit types > float. | Tim Northover | 2014-07-17 | 1 | -2/+2
  This makes the two intrinsics @llvm.convert.from.f16 and @llvm.convert.to.f16 accept types other than simple "float". This is only strictly needed for the truncate operation, since otherwise double rounding occurs and there's no way to represent the strict IEEE conversion. However, for symmetry we allow larger types in the extend too.
  During legalization, we can expand an "fp16_to_double" operation into two extends for convenience, but abort when the truncate isn't legal. A new libcall is probably needed here.
  Even after this commit, various target tweaks are needed to actually use the extended intrinsics. I've put these into separate commits for clarity, so there are no actual tests of f64 conversion here.
  llvm-svn: 213248
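The double rounding mentioned in this commit can be illustrated with a small Python sketch. It is not part of the change itself; `to_f32` and `to_f16` are hypothetical helpers that round a Python float (an IEEE double) by packing it with the standard `struct` module, and the input value is chosen for this example so that it lies just above the midpoint between two adjacent half-precision values.

```python
import struct

def to_f32(x):
    """Round a double to the nearest IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def to_f16(x):
    """Round a double to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

# Just above the midpoint between the halves 1.0 and 1.0 + 2**-10.
x = 1.0 + 2.0**-11 + 2.0**-30

direct = to_f16(x)             # single rounding: double -> half
via_float = to_f16(to_f32(x))  # double rounding: double -> float -> half

print(direct, via_float)
```

Rounding to float first discards the small 2**-30 term and lands exactly on the midpoint, so the second rounding is a tie that resolves to the even neighbour; the direct conversion sees that the value is above the midpoint and rounds up. The two results therefore differ, which is why the truncate needs to accept the wider type directly.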
* Use range for | Matt Arsenault | 2014-07-17 | 1 | -6/+4
  llvm-svn: 213230
* R600: Short circuit alloca check if address space isn't private. | Matt Arsenault | 2014-07-17 | 1 | -1/+1
  Skip calling GetUnderlyingObject in cases where it obviously isn't from an alloca. This should only be a compile time improvement.
  llvm-svn: 213229
* R600/SI: Allow using f32 rcp / rsq when denormals not handled. | Matt Arsenault | 2014-07-15 | 3 | -10/+31
  These are precise enough to use for OpenCL unless denormals are handled.
  llvm-svn: 213107
* R600/SI: Fix select on i1 | Matt Arsenault | 2014-07-15 | 1 | -0/+3
  llvm-svn: 213096
* R600/SI: Implement less wrong f32 fdiv | Matt Arsenault | 2014-07-15 | 3 | -7/+83
  Assuming single precision denormals and accurate sqrt/div are not reported, this passes the OpenCL conformance test.
  llvm-svn: 213089
* R600: Add predicate for UnsafeFPMath | Matt Arsenault | 2014-07-15 | 1 | -0/+1
  llvm-svn: 213088
* R600: Remove intrinsics that appear to be unused | Matt Arsenault | 2014-07-15 | 1 | -3/+0
  llvm-svn: 213087
* R600: Implement zero undef variants of ctlz/cttz | Jan Vesely | 2014-07-15 | 3 | -0/+17
  v2: use ffbh/l if available
  v3: Rebase on top of Matt's SI patches
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  Reviewed-by: Tom Stellard <tom@stellard.net>
  llvm-svn: 213072
* Prune redundant libdeps in CMake's target_link_libraries and LLVMBuild.txt. | NAKAMURA Takumi | 2014-07-15 | 1 | -1/+1
  I checked this with Release+Asserts on x86_64-mingw32. Please restore partially if this was overkill.
  llvm-svn: 213064
* R600: Add dag combine for copy of an illegal type. | Matt Arsenault | 2014-07-15 | 2 | -1/+56
  This helps avoid redundant instructions to unpack, and repack the vectors. Ideally we could recognize that pattern and eliminate it. Currently v4i8 and other small element type vectors are scalarized, so this has the added bonus of avoiding that.
  llvm-svn: 213031
* R600: Add denormal handling subtarget features. | Matt Arsenault | 2014-07-14 | 5 | -4/+56
  llvm-svn: 213018
* R600/SI: Default to no single precision denormals. | Matt Arsenault | 2014-07-14 | 1 | -1/+9
  llvm-svn: 213017
* Remove unused include | Matt Arsenault | 2014-07-13 | 1 | -1/+0
  llvm-svn: 212898
* R600: Use range for and fix missing consts. | Matt Arsenault | 2014-07-13 | 2 | -29/+20
  llvm-svn: 212897
* R600: Make ShaderType private | Matt Arsenault | 2014-07-13 | 9 | -34/+45
  llvm-svn: 212896
* R600: Add option to disable promote alloca | Matt Arsenault | 2014-07-13 | 4 | -5/+24
  This can make writing some tests harder, so add a flag to disable it.
  llvm-svn: 212893
* R600/SI: Use i32 vectors for resources and samplers | Marek Olsak | 2014-07-11 | 2 | -5/+5
  This affects new intrinsics only. What surprises me is that v32i8 still works.
  llvm-svn: 212831
* R600/SI: add sample and image intrinsics exposing all instruction fields | Marek Olsak | 2014-07-11 | 2 | -48/+192
  We need the intrinsics with offsets, so why not just add them all. The R128 parameter will also be useful for reducing SGPR usage. GL_ARB_image_load_store also adds some image GLSL modifiers like "coherent", so Mesa will probably translate those to slc, glc, etc.
  When LLVM 3.5 is released, I'll switch Mesa to these new intrinsics.
  llvm-svn: 212830
* R600/SI: fix shadow mapping for 1D and 2D array textures | Marek Olsak | 2014-07-11 | 1 | -1/+1
  It was conflicting with def TEX_SHADOW_ARRAY, which also handles them.
  Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
  llvm-svn: 212829
* R600: Implement float to long/ulong | Jan Vesely | 2014-07-10 | 3 | -2/+18
  Use the algorithm from LegalizeDAG.cpp.
  Move the Expand setting to SIISelLowering.
  v2: Extend existing tests instead of creating new ones
  v3: use separate LowerFPTOSINT function
  v4: use TargetLowering::expandFP_TO_SINT, add comment about using FP_TO_SINT for uints
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  Reviewed-by: Tom Stellard <tom@stellard.net>
  llvm-svn: 212773
* R600/SI: Add support for llvm.convert.{to|from}.fp16 | Matt Arsenault | 2014-07-10 | 1 | -2/+6
  llvm-svn: 212676
* R600: Fix mishandling of load / store chains. | Matt Arsenault | 2014-07-07 | 3 | -36/+90
  Fixes various bugs with reordering loads and stores. Scalarized vector loads weren't collecting the chains at all.
  llvm-svn: 212473
* Fix typo, weird indentation | Matt Arsenault | 2014-07-07 | 1 | -2/+4
  llvm-svn: 212472
* Use cast<> instead of dyn_cast + assert | Matt Arsenault | 2014-07-05 | 1 | -2/+1
  llvm-svn: 212380
* Fix grammar | Matt Arsenault | 2014-07-05 | 1 | -1/+1
  llvm-svn: 212379
* [codegen,aarch64] Add a target hook to the code generator to control | Chandler Carruth | 2014-07-03 | 2 | -3/+9
  vector type legalization strategies in a more fine grained manner, and change the legalization of several v1iN types and v1f32 to be widening rather than scalarization on AArch64.
  This fixes an assertion failure caused by scalarizing nodes like "v1i32 trunc v1i64". As v1i64 is legal it will fail to scalarize v1i32.
  This also provides a foundation for other targets to have more granular control over how vector types are legalized.
  Patch by Hao Liu, reviewed by Tim Northover. I'm committing it to allow some work to start taking place on top of this patch as it adds some really important hooks to the backend that I'd like to immediately start using. =]
  http://reviews.llvm.org/D4322
  llvm-svn: 212242
* R600: Add a comment that llvm.AMDGPU.trunc is a legacy intrinsic | Tom Stellard | 2014-07-02 | 1 | -1/+1
  llvm-svn: 212218
* R600/SI: Use a ComplexPattern for ADDR64 addressing of MUBUF loads | Tom Stellard | 2014-07-02 | 2 | -37/+35
  llvm-svn: 212217
* R600: Promote i64 loads to v2i32 | Tom Stellard | 2014-07-02 | 3 | -7/+12
  llvm-svn: 212216
* R600/SI: Adjust SGPR live ranges before register allocation | Tom Stellard | 2014-07-02 | 4 | -0/+118
  SGPRs are written by instructions that sometimes will ignore control flow, which means if you have code like:

    if (VGPR0) {
      SGPR0 = S_MOV_B32 0
    } else {
      SGPR0 = S_MOV_B32 1
    }

  The value of SGPR0 will be 1 no matter what the condition is.
  In order to deal with this situation correctly, we need to view the program as if it were a single basic block when we calculate the live ranges for the SGPRs.
  The way we actually update the live range is by iterating over all of the segments in each LiveRange object and setting the end of each segment equal to the start of the next segment. So a live range like:

    [3888r,9312r:0)[10032B,10384B:0) 0@3888r

  will become:

    [3888r,10032B:0)[10032B,10384B:0) 0@3888r

  This change will allow us to use SALU instructions within branches.
  llvm-svn: 212215
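The segment update described in this commit can be sketched in a few lines of Python. This is a simplified model and not the LLVM code: segments are plain `(start, end)` tuples of integers rather than LiveRange segments with slot indexes such as `3888r` or `10032B`, the list is assumed to be sorted and non-overlapping, and `extend_segments` is a name invented for this example.

```python
def extend_segments(segments):
    """Set the end of each segment to the start of the next segment.

    `segments` is assumed to be a sorted list of (start, end) tuples.
    The last segment has no successor and is kept unchanged.
    """
    extended = []
    for (start, _end), (next_start, _next_end) in zip(segments, segments[1:]):
        extended.append((start, next_start))
    if segments:
        extended.append(segments[-1])
    return extended

# The live range from the commit message, with slot suffixes dropped.
print(extend_segments([(3888, 9312), (10032, 10384)]))
```

Closing the gaps this way makes the register appear live across the whole region, which matches the idea of treating the program as a single basic block for SGPR liveness.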
* R600/SI: Add verifier check for immediates in register operands. | Tom Stellard | 2014-07-02 | 4 | -2/+33
  llvm-svn: 212214
* R600: Fix crashes when an illegal type load or store is not handled. | Matt Arsenault | 2014-07-02 | 1 | -2/+6
  I don't think anything hits this now, but will be exposed in future patches.
  llvm-svn: 212197
* R600: Move mul combine to separate function | Matt Arsenault | 2014-06-30 | 2 | -28/+35
  llvm-svn: 212052
* R600: Remove unused declarations leftover from AMDIL | Matt Arsenault | 2014-06-30 | 1 | -8/+0
  llvm-svn: 212051
* Add ops() method to SDNode that returns an ArrayRef<SDUse>. Use it to simplify some code. | Craig Topper | 2014-06-29 | 1 | -3/+2
  llvm-svn: 211993
* R600: Move trivial getters into header, use initializer list | Matt Arsenault | 2014-06-27 | 2 | -95/+82
  llvm-svn: 211917