path: root/llvm/test/CodeGen/NVPTX
Commit message (Author, Age, Files, Lines)
...
* [NVPTX] Add missing isel patterns for 64-bit atomics (Justin Holewinski, 2014-06-27, 1 file, -0/+141)
    llvm-svn: 211933
* [NVPTX] Add isel patterns for bit-field extract (bfe) (Justin Holewinski, 2014-06-27, 1 file, -0/+32)
    llvm-svn: 211932
* [NVPTX] Add support for isspacep instruction (Justin Holewinski, 2014-06-27, 1 file, -0/+35)
    llvm-svn: 211931
* [NVPTX] Add support for envreg reads (Justin Holewinski, 2014-06-27, 1 file, -0/+139)
    llvm-svn: 211930
* [NVPTX] Emit .weak when linkage is not external, internal, or private (Justin Holewinski, 2014-06-27, 1 file, -0/+12)
    llvm-svn: 211926
* Canonicalize addrspacecast ConstExpr between different pointer types (Jingyue Wu, 2014-06-15, 1 file, -4/+4)
    As a follow-up to r210375 which canonicalizes addrspacecast instructions, this patch canonicalizes addrspacecast constant expressions. Given clang uses ConstantExpr::getAddrSpaceCast to emit addrspacecast constant expressions, this patch is also a step towards having the frontend emit canonicalized addrspacecasts.
    Piggyback a minor refactor in InstCombineCasts.cpp
    Update three affected tests in addrspacecast-alias.ll, access-non-generic.ll and constant-fold-gep.ll and added one new test in constant-fold-address-space-pointer.ll
    llvm-svn: 211004
* Reduce verbiage of lit.local.cfg files (Alp Toker, 2014-06-09, 1 file, -2/+1)
    We can just split targets_to_build in one place and make it immutable.
    llvm-svn: 210496
* Allow aliases to be unnamed_addr. (Rafael Espindola, 2014-06-06, 1 file, -1/+1)
    Aliases with unnamed_addr were in a strange state. It is stored in GlobalValue, the language reference talks about "unnamed_addr aliases" but the verifier was rejecting them.
    It seems natural to allow unnamed_addr in aliases:
    * It is a property of how it is accessed, not of the data itself.
    * It is perfectly possible to write code that depends on the address of an alias.
    This patch then makes unnamed_addr legal for aliases.
    One side effect is that the syntax changes for a corner case: In globals, unnamed_addr is now printed before the address space.
    llvm-svn: 210302
* Fix the test: DCE optimized away everything. (Eli Bendersky, 2014-04-21, 1 file, -9/+9)
    Use volatile store to protect the generated PTX from DCE.
    Patch by Jingyue Wu.
    llvm-svn: 206763
* [NVPTX] Add preliminary intrinsics and codegen support for textures/surfaces (Justin Holewinski, 2014-04-09, 3 files, -0/+56)
    This commit adds intrinsics and codegen support for the surface read/write and texture read instructions that take an explicit sampler parameter. Codegen operates on image handles at the PTX level, but falls back to direct replacement of handles with kernel arguments if image handles are not enabled. Note that image handles are explicitly disabled for all target architectures in this change (to be enabled later).
    llvm-svn: 205907
* [NVPTX] Add support for addrspacecast in global variable initializers, including emitting generic() when casting to address space 0. (Justin Holewinski, 2014-04-09, 1 file, -0/+9)
    llvm-svn: 205906
* Optimize away unnecessary address casts. (Eli Bendersky, 2014-04-03, 2 files, -2/+93)
    Removes unnecessary casts from non-generic address spaces to the generic address space for certain code patterns.
    Patch by Jingyue Wu.
    llvm-svn: 205571
* Fix for PR19099 - NVPTX produces invalid symbol names. (Eli Bendersky, 2014-03-31, 1 file, -2/+25)
    This is a more thorough fix for the issue than r203483. An IR pass will run before NVPTX codegen to make sure there are no invalid symbol names that can't be consumed by the ptxas assembler.
    llvm-svn: 205212
* Add test to test/CodeGen/NVPTX for "alloca buffer" arguments. (Eli Bendersky, 2014-03-24, 1 file, -0/+66)
    Make sure such IR gets properly lowered to PTX.
    llvm-svn: 204624
* [NVPTX] Add isel patterns for addrspacecast (Justin Holewinski, 2014-03-24, 1 file, -0/+99)
    llvm-svn: 204600
* Expose "noduplicate" attribute as a property for intrinsics. (Eli Bendersky, 2014-03-18, 1 file, -0/+74)
    The "noduplicate" function attribute exists to prevent certain optimizations from duplicating calls to the function. This is important on platforms where certain function call duplications are unsafe (for example execution barriers for CUDA and OpenCL).
    This patch makes it possible to specify intrinsics as "noduplicate" and translates that to the appropriate function attribute.
    llvm-svn: 204200
* Remove the linker_private and linker_private_weak linkages. (Rafael Espindola, 2014-03-13, 1 file, -1/+1)
    These linkages were introduced some time ago, but it was never very clear what exactly their semantics were or what they should be used for. Some investigation found these uses:
    * utf-16 strings in clang.
    * non-unnamed_addr strings produced by the sanitizers.
    It turns out they were just working around a more fundamental problem. For some sections a MachO linker needs a symbol in order to split the section into atoms, and llvm had no idea that was the case. I fixed that in r201700 and it is now safe to use the private linkage. When the object ends up in a section that requires symbols, llvm will use a 'l' prefix instead of a 'L' prefix and things just work.
    With that, these linkages were already dead, but there was a potential future user in the objc metadata information. I am still looking at CGObjcMac.cpp, but at this point I am convinced that linker_private and linker_private_weak are not what they need.
    The objc uses are currently split in
    * Regular symbols (no '\01' prefix). LLVM already directly provides whatever semantics they need.
    * Uses of a private name (start with "\01L" or "\01l") and private linkage. We can drop the "\01L" and "\01l" prefixes as soon as llvm agrees with clang on L being ok or not for a given section. I have two patches in code review for this.
    * Uses of private name and weak linkage.
    The last case is the one that one could think would fit one of these linkages. That is not the case. The semantics are
    * the linker will merge these symbols by *name*.
    * the linker will hide them in the final DSO.
    Given that the merging is done by name, any of the private (or internal) linkages would be a bad match. They allow llvm to rename the symbols, and that is really not what we want. From the llvm point of view, these objects should really be (linkonce|weak)(_odr)?.
    For now, just keeping the "\01l" prefix is probably the best for these symbols. If we one day want to have a more direct support in llvm, IMHO what we should add is not a linkage, it is just a hidden_symbol attribute. It would be applicable to multiple linkages. For example, on weak it would produce the current behavior we have for objc metadata. On internal, it would be equivalent to private (and we should then remove private).
    llvm-svn: 203866
* Followup to r203483 - add test. (Eli Bendersky, 2014-03-10, 1 file, -0/+8)
    [forgot to 'svn add' before committing r203483]
    llvm-svn: 203485
* [NVPTX] Fix emitting aggregate parameters (Gautam Chakrabarti, 2014-01-28, 1 file, -0/+20)
    The code was missing the case for aggregate parameters and hence was emitting them as .b0 type.
    Also fixed a couple of comments.
    llvm-svn: 200325
* [NVPTX] Add missing patterns for div.approx with immediate denominator (Justin Holewinski, 2014-01-21, 1 file, -0/+8)
    llvm-svn: 199746
* Fix non-deterministic SDNodeOrder-dependent codegen (Nico Rieck, 2014-01-12, 1 file, -3/+3)
    Reset SelectionDAGBuilder's SDNodeOrder to ensure deterministic code generation.
    llvm-svn: 199050
* [NVPTX] Fix off-by-one error when creating the VT list for an SDNode (Justin Holewinski, 2013-12-05, 1 file, -0/+10)
    llvm-svn: 196503
* [NVPTX] Fix handling of indirect calls (Justin Holewinski, 2013-11-15, 1 file, -0/+10)
    Using a special machine node is cleaner than an InlineAsm node, and fixes an assertion failure in InstrEmitter
    llvm-svn: 194810
* [NVPTX] Properly handle bitcast ConstantExpr when checking for the alignment of function parameters (Justin Holewinski, 2013-11-11, 1 file, -0/+26)
    llvm-svn: 194410
* [NVPTX] Fix logic error in loading vector parameters of more than 4 components (Justin Holewinski, 2013-11-11, 1 file, -0/+13)
    llvm-svn: 194409
* [NVPTX] Switch from StrongPHIElimination to PHIElimination in NVPTXTargetMachine, and add some missing optimization passes to addOptimizedRegAlloc (Justin Holewinski, 2013-10-11, 1 file, -0/+38)
    Fixes PR17529
    llvm-svn: 192445
* Make AsmPrinter::emitImplicitDef a virtual method so targets can emit custom comments for implicit defs (Justin Holewinski, 2013-10-11, 1 file, -0/+9)
    For NVPTX, this fixes a crash where the emitImplicitDef implementation was expecting physical registers, while NVPTX uses virtual registers (with a couple of exceptions). Now, the implicit def comment will be emitted as a true PTX register name. Other targets can use this to customize the output of implicit def comments.
    Fixes PR17519
    llvm-svn: 192444
* [NVPTX] Make constant vector test case endian-independent (Justin Holewinski, 2013-09-19, 1 file, -3/+2)
    llvm-svn: 190998
* [NVPTX] Support constant vector globals (Justin Holewinski, 2013-09-19, 1 file, -0/+7)
    llvm-svn: 190997
* [NVPTX] Re-enable assembly printing support for inline assembly (Justin Holewinski, 2013-08-24, 1 file, -0/+9)
    This support was removed by accident during the MC conversion
    llvm-svn: 189160
* [tests] Cleanup initialization of test suffixes. (Daniel Dunbar, 2013-08-16, 1 file, -2/+0)
    - Instead of setting the suffixes in a bunch of places, just set one master list in the top-level config. We now only modify the suffix list in a few suites that have one particular unique suffix (.ml, .mc, .yaml, .td, .py).
    - Aside from removing the need for a bunch of lit.local.cfg files, this enables 4 tests that were inadvertently being skipped (one in Transforms/BranchFolding, a .s file each in DebugInfo/AArch64 and CodeGen/PowerPC, and one in CodeGen/SI which is now failing and has been XFAILED).
    - This commit also fixes a bunch of config files to use config.root instead of older copy-pasted code.
    llvm-svn: 188513
* [NVPTX] Add missing patterns for i1 [s,u]int_to_fp (Justin Holewinski, 2013-08-06, 1 file, -0/+37)
    llvm-svn: 187800
* [NVPTX] Fix bug in stack code generation caused by MC conversion (Justin Holewinski, 2013-08-06, 1 file, -0/+18)
    We do use a very small set of physical registers, so account for them in the virtual register encoding between MachineInstr and MC
    llvm-svn: 187799
* [NVPTX] Start conversion to MC infrastructure (Justin Holewinski, 2013-08-06, 1 file, -0/+18)
    This change converts the NVPTX target to use the MC infrastructure instead of directly emitting MachineInstr instances. This brings the target more up-to-date with LLVM TOT, and should fix PR15175 and PR15958 (libNVPTXInstPrinter is empty) as a side-effect.
    llvm-svn: 187798
* Add a target legalize hook for SplitVectorOperand (again) (Justin Holewinski, 2013-07-26, 1 file, -0/+30)
    CustomLowerNode was not being called during SplitVectorOperand, meaning custom legalization could not be used by targets.
    This also adds a test case for NVPTX that depends on this custom legalization.
    Differential Revision: http://llvm-reviews.chandlerc.com/D1195
    Attempt to fix the buildbots by making the X86 test I just added platform independent
    llvm-svn: 187202
* Revert "Add a target legalize hook for SplitVectorOperand" (Rafael Espindola, 2013-07-26, 1 file, -30/+0)
    This reverts commit 187198. It broke the bots.
    The soft float test probably needs a -triple because of name differences.
    On the hard float test I am getting a "roundss $1, %xmm0, %xmm0", instead of "vroundss $1, %xmm0, %xmm0, %xmm0".
    llvm-svn: 187201
* Add a target legalize hook for SplitVectorOperand (Justin Holewinski, 2013-07-26, 1 file, -0/+30)
    CustomLowerNode was not being called during SplitVectorOperand, meaning custom legalization could not be used by targets.
    This also adds a test case for NVPTX that depends on this custom legalization.
    Differential Revision: http://llvm-reviews.chandlerc.com/D1195
    llvm-svn: 187198
* [NVPTX] Use approximate FP ops when unsafe-fp-math is used, and append .ftz to instructions if the nvptx-f32ftz attribute is set to "true" (Justin Holewinski, 2013-07-22, 1 file, -0/+43)
    llvm-svn: 186820
* Convert CodeGen/*/*.ll tests to use the new CHECK-LABEL for easier debugging. (Stephen Lin, 2013-07-13, 5 files, -5/+5)
    No functionality change and all tests pass after conversion.
    This was done with the following sed invocation to catch label lines demarking function boundaries:
        sed -i '' "s/^;\( *\)\([A-Z0-9_]*\):\( *\)test\([A-Za-z0-9_-]*\):\( *\)$/;\1\2-LABEL:\3test\4:\5/g" test/CodeGen/*/*.ll
    which was written conservatively to avoid false positives rather than false negatives. I scanned through all the changes and everything looks correct.
    llvm-svn: 186258
* [NVPTX] Add support for module-scope inline asm (Justin Holewinski, 2013-07-01, 1 file, -0/+10)
    Since we were explicitly not calling AsmPrinter::doInitialization, any module-scope inline asm was not being printed.
    llvm-svn: 185336
* [NVPTX] 64-bit ADDC/ADDE are not legal (Justin Holewinski, 2013-07-01, 1 file, -0/+19)
    llvm-svn: 185333
* [NVPTX] Fix vector loads from parameters that span multiple loads, and fix some typos (Justin Holewinski, 2013-07-01, 1 file, -0/+13)
    llvm-svn: 185332
* [NVPTX] Handle signext/zeroext attributes properly (Justin Holewinski, 2013-07-01, 1 file, -0/+16)
    Fix a case where we were incorrectly sign-extending a value when we should have been zero-extending the value. Also change some SIGN_EXTEND to ANY_EXTEND because we really don't care and may have more opportunity to fold subexpressions
    llvm-svn: 185331
* [NVPTX] Add support for native SIGN_EXTEND_INREG where available (Justin Holewinski, 2013-07-01, 1 file, -0/+111)
    llvm-svn: 185330
* [NVPTX] Add isel patterns for [reg+offset] form of ldg/ldu. (Justin Holewinski, 2013-07-01, 1 file, -0/+21)
    llvm-svn: 185329
* [NVPTX] Make sure we zero out high-order 24 bits for 8-bit load into 32-bit value (Justin Holewinski, 2013-07-01, 1 file, -0/+14)
    llvm-svn: 185328
* [NVPTX] Add (1.0 / sqrt(x)) => rsqrt(x) generation when allowable by FP flags (Justin Holewinski, 2013-06-28, 1 file, -0/+13)
    llvm-svn: 185178
* [NVPTX] Calling conventions fix (Justin Holewinski, 2013-06-28, 5 files, -40/+63)
    Fix ABI handling for function returning bool -- use st.param.b32 to return the value and use ld.param.b32 in caller to load the return value.
    llvm-svn: 185177
* [NVPTX] Add support for cttz/ctlz/ctpop (Justin Holewinski, 2013-06-28, 3 files, -0/+114)
    llvm-svn: 185176
* [NVPTX] Clean up comparison/select/convert patterns and factor out PTX instructions from their patterns (Justin Holewinski, 2013-06-28, 1 file, -4/+4)
    Test case is no breakage
    llvm-svn: 185175