path: root/llvm/lib/Target
Commit message, Author, Age, Files, Lines
* R600/SI: Remove SI_ADDR64_RSRCMatt Arsenault2014-11-054-44/+62
| | | | llvm-svn: 221382
* [NVPTX] Add NVPTXLowerStructArgs passJustin Holewinski2014-11-054-0/+154
| This works around the limitation that PTX does not allow .param space
| loads/stores with arbitrary pointers.
|
| If a function has a by-val struct ptr arg, say foo(%struct.x *byval %d),
| then add the following instructions to the first basic block:
|
|   %temp = alloca %struct.x, align 8
|   %tt1 = bitcast %struct.x * %d to i8 *
|   %tt2 = llvm.nvvm.cvt.gen.to.param %tt1
|   %tempd = bitcast i8 addrspace(101) * %tt2 to %struct.x addrspace(101) *
|   %tv = load %struct.x addrspace(101) * %tempd
|   store %struct.x %tv, %struct.x * %temp, align 8
|
| The above code allocates some space in the stack and copies the incoming
| struct from param space to local space. Then replace all occurrences of
| %d by %temp.
|
| Fixes PR21465.
|
| llvm-svn: 221377
* IR: MDNode => Value: NamedMDNode::getOperand()Duncan P. N. Exon Smith2014-11-052-2/+2
| | | | | | | | | | | | | Change `NamedMDNode::getOperand()` from returning `MDNode *` to returning `Value *`. To reduce boilerplate at some call sites, add a `getOperandAsMDNode()` for named metadata that's expected to only return `MDNode` -- for now, that's everything, but debug node named metadata (such as llvm.dbg.cu and llvm.dbg.sp) will soon change. This is part of PR21433. Note that there's a follow-up patch to clang for the API change. llvm-svn: 221375
* [ARM] Remove more dead code.Tilmann Scheller2014-11-051-4/+0
| | | | | | Dead code identified by the Clang static analyzer. llvm-svn: 221372
* [mips][microMIPS] Implement CodeGen support for ANDI16 instructionZoran Jovanovic2014-11-052-2/+13
| | | | llvm-svn: 221371
* [Hexagon] [NFC] Alphabetizing cmake files.Colin LeMahieu2014-11-051-6/+6
| | | | llvm-svn: 221370
* [mips][microMIPS] Implement CodeGen support for SLL16 and SRL16 instructionsZoran Jovanovic2014-11-052-9/+22
| | | | llvm-svn: 221369
* [ARM] Remove another redundant assignment.Tilmann Scheller2014-11-051-1/+0
| | | | | | Found by the Clang static analyzer. llvm-svn: 221368
* [mips][microMIPS] Implement ANDI16 instructionZoran Jovanovic2014-11-055-0/+64
| | | | llvm-svn: 221367
* [ARM] Remove redundant assignment.Tilmann Scheller2014-11-051-1/+0
| | | | | | Found by the Clang static analyzer. llvm-svn: 221366
* [ARM] Remove dead code identified by the Clang static analyzer.Tilmann Scheller2014-11-051-2/+0
| | | | llvm-svn: 221358
* [mips][microMIPS] Mark symbols as microMIPS if necessaryZoran Jovanovic2014-11-052-0/+52
| | | | | | Differential Revision: http://reviews.llvm.org/D6039 llvm-svn: 221355
* Reverted revisions 221351, 221352 and 221353.Zoran Jovanovic2014-11-056-99/+11
| | | | llvm-svn: 221354
* [mips][microMIPS] Implement CodeGen support for ANDI16 instructionZoran Jovanovic2014-11-052-2/+13
| | | | | | Differential Revision: http://reviews.llvm.org/D5797 llvm-svn: 221353
* [mips][microMIPS] Implement CodeGen support for SLL16 and SRL16 instructionsZoran Jovanovic2014-11-052-9/+22
| | | | | | Differential Revision: http://reviews.llvm.org/D5933 llvm-svn: 221352
* [mips][microMIPS] Implement ANDI16 instructionZoran Jovanovic2014-11-055-0/+64
| | | | | | Differential Revision: http://reviews.llvm.org/D5163 llvm-svn: 221351
* R600/SI: Change all instruction assembly names to lowercase.Tom Stellard2014-11-052-859/+859
| This matches the format produced by the AMD proprietary driver.
|
| //==================================================================//
| // Shell script for converting .ll test cases: (Pass the .ll files
| // you want to convert to this script as arguments).
| //==================================================================//
|
| # This was necessary on my system so that A-Z in sed would match only
| # upper case. I'm not sure why.
| export LC_ALL='C'
|
| TEST_FILES="$*"
|
| MATCHES=`grep -v Patterns SIInstructions.td | grep -o '"[A-Z0-9_]\+["e]' | grep -o '[A-Z0-9_]\+' | sort -r`
|
| for f in $TEST_FILES; do
|   # Check that there are SI tests:
|   grep -q -e 'verde' -e 'bonaire' -e 'SI' -e 'tahiti' $f
|   if [ $? -eq 0 ]; then
|     for match in $MATCHES; do
|       sed -i -e "s/\([ :]$match\)/\L\1/" $f
|     done
|     # Try to get check lines with partial instruction names
|     sed -i 's/\(;[ ]*SI[A-Z\\-]*: \)\([A-Z_0-9]\+\)/\1\L\2/' $f
|   fi
| done
|
| sed -i -e 's/bb0_1/BB0_1/g' ../../../test/CodeGen/R600/infinite-loop.ll
| sed -i -e 's/SI-NOT: bfe/SI-NOT: {{[^@]}}bfe/g' ../../../test/CodeGen/R600/llvm.AMDGPU.bfe.*32.ll ../../../test/CodeGen/R600/sext-in-reg.ll
| sed -i -e 's/exp_IEEE/EXP_IEEE/g' ../../../test/CodeGen/R600/llvm.exp2.ll
| sed -i -e 's/numVgprs/NumVgprs/g' ../../../test/CodeGen/R600/register-count-comments.ll
| sed -i 's/\(; CHECK[-NOT]*: \)\([A-Z_0-9]\+\)/\1\L\2/' ../../../test/CodeGen/R600/select64.ll ../../../test/CodeGen/R600/sgpr-copy.ll
|
| //==================================================================//
| // Shell script for converting .td files (run this last)
| //==================================================================//
|
| export LC_ALL='C'
| sed -i -e '/Patterns/!s/\("[A-Z0-9_]\+[ "e]\)/\L\1/g' SIInstructions.td
| sed -i -e 's/"EXP/"exp/g' SIInstrInfo.td
|
| llvm-svn: 221350
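The .td conversion step above relies on GNU sed's `\L` case-conversion extension. As a rough illustration of what that substitution does to a single line, here is a minimal Python sketch. The function name `lowercase_asm_names` is hypothetical and not part of the commit; the regular expression is a direct transcription of the sed pattern `"[A-Z0-9_]\+[ "e]`, and lines containing `Patterns` are skipped just as the `/Patterns/!` address does.

```python
import re

# Transcription of the sed pattern: an opening quote, a run of upper-case
# letters/digits/underscores, then a space, a closing quote, or an 'e'
# (the start of suffixes such as _e32).
_ASM_NAME = re.compile(r'"[A-Z0-9_]+[ "e]')


def lowercase_asm_names(line: str) -> str:
    """Lower-case quoted assembly names on one line of a .td file.

    Hypothetical helper: mirrors `sed -e '/Patterns/!s/.../\\L\\1/g'`.
    """
    if "Patterns" in line:
        # The sed address /Patterns/! leaves these lines untouched.
        return line
    return _ASM_NAME.sub(lambda m: m.group(0).lower(), line)


if __name__ == "__main__":
    print(lowercase_asm_names('def S_MOV_B32 : SOP1_32 <0x03, "S_MOV_B32", []>;'))
```

Only the quoted assembly string is changed; the unquoted TableGen record name keeps its upper-case spelling.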
* [X86] Teach method 'isVectorClearMaskLegal' how to check for legal blend masks.Andrea Di Biagio2014-11-051-1/+3
| This patch improves the folding of vector AND nodes into blend operations for
| targets that feature SSE4.1. A vector AND node where one of the operands is
| a constant build_vector with elements that are either zero or all-ones can be
| converted into a blend.
|
| This allows, for example, simplifying the following code:
|
|   define <4 x i32> @test(<4 x i32> %A, <4 x i32> %B) {
|     %1 = and <4 x i32> %A, <i32 0, i32 0, i32 0, i32 -1>
|     %2 = and <4 x i32> %B, <i32 -1, i32 -1, i32 -1, i32 0>
|     %3 = or <4 x i32> %1, %2
|     ret <4 x i32> %3
|   }
|
| Before this patch llc (-mcpu=corei7) generated:
|
|   andps LCPI1_0(%rip), %xmm0, %xmm0
|   andps LCPI1_1(%rip), %xmm1, %xmm1
|   orps %xmm1, %xmm0, %xmm0
|   retq
|
| With this patch we generate a single 'vpblendw'.
|
| llvm-svn: 221343
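The equivalence this commit relies on can be checked with a small model. The following Python sketch treats a vector as a list of 32-bit lanes; the helper names (`is_blend_mask`, `vec_and`, `vec_or`, `blend_lanes`) are hypothetical and only illustrate the idea, they are not LLVM APIs.

```python
ALL_ONES = 0xFFFFFFFF  # an i32 lane with every bit set (printed as -1 in IR)


def is_blend_mask(mask):
    """True if every lane of the constant is either zero or all-ones."""
    return all(lane in (0, ALL_ONES) for lane in mask)


def vec_and(a, mask):
    return [x & m for x, m in zip(a, mask)]


def vec_or(a, b):
    return [x | y for x, y in zip(a, b)]


def blend_lanes(a, b, take_b):
    """Per-lane select: lane i comes from b when take_b[i] is true."""
    return [y if t else x for x, y, t in zip(a, b, take_b)]


if __name__ == "__main__":
    A = [1, 2, 3, 4]
    B = [5, 6, 7, 8]
    mask_a = [0, 0, 0, ALL_ONES]
    mask_b = [ALL_ONES, ALL_ONES, ALL_ONES, 0]
    assert is_blend_mask(mask_a) and is_blend_mask(mask_b)
    # The and/and/or sequence from the commit message ...
    via_and_or = vec_or(vec_and(A, mask_a), vec_and(B, mask_b))
    # ... picks lanes 0-2 from B and lane 3 from A, i.e. a single blend.
    via_blend = blend_lanes(A, B, [True, True, True, False])
    print(via_and_or, via_blend)
```

Because each mask lane either keeps or clears a whole element, the AND never mixes bits within a lane, which is why a lane-wise select produces the same result.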
* [ARM] Honor FeatureD16 in the assembler and disassemblerOliver Stannard2014-11-052-1/+12
| | | | | | | | | | | | | Some ARM FPUs only have 16 double-precision registers, rather than the normal 32. LLVM represents this with the D16 target feature. This is currently used by CodeGen to avoid using high registers when they are not available, but the assembler and disassembler do not honor it. I fix this in the assembler and disassembler rather than the InstrInfo.td files, as the latter would require a large number of changes everywhere one of the floating-point instructions is referenced in the backend. This solution is similar to the one used for co-processor numbers and MSR masks. llvm-svn: 221341
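The check described above amounts to rejecting d16 through d31 when the D16 feature is set. A minimal Python sketch of that rule follows; `max_dreg_count` and `is_valid_dreg` are hypothetical names chosen for illustration and do not correspond to functions in the ARM assembler.

```python
def max_dreg_count(has_d16: bool) -> int:
    """Number of double-precision registers: 16 with D16, otherwise 32."""
    return 16 if has_d16 else 32


def is_valid_dreg(name: str, has_d16: bool) -> bool:
    """Accept 'dN' only if N is within the register file for this FPU."""
    if not (name.startswith("d") and name[1:].isdigit()):
        return False
    return int(name[1:]) < max_dreg_count(has_d16)


if __name__ == "__main__":
    for reg in ("d15", "d16", "d31"):
        print(reg, is_valid_dreg(reg, has_d16=True), is_valid_dreg(reg, has_d16=False))
```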
* ARM: try to add extra CS-register whenever stack alignment >= 8.Tim Northover2014-11-051-1/+1
| | | | | | | | We currently try to push an even number of registers to preserve 8-byte alignment during a function's prologue, but only when the stack alignment is precisely 8. Many of the reasons for doing it also apply when that alignment is > 8 (the extra store is often free, and can save another stack adjustment, though less frequently for 16-byte stack alignment). llvm-svn: 221321
* ARM/Dwarf: correctly align stack before callee-saved VPRsTim Northover2014-11-052-5/+26
| We were making an attempt to do this by adding an extra callee-saved GPR (so
| that there was an even number in the list), but when that failed we went ahead
| and pushed anyway.
|
| This had a couple of potential issues:
|   + The .cfi directives we emit misplaced dN because they were based on
|     PrologEpilogInserter's calculation.
|   + Unaligned stores can be less efficient.
|   + Unaligned stores can actually fault (likely only an issue in niche cases,
|     but possible).
|
| This adds a final explicit stack adjustment if all other options fail, so that
| the actual locations of the registers match up with where they should be.
|
| llvm-svn: 221320
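The arithmetic behind the two ARM stack-alignment commits above can be sketched as follows. This is an illustrative model only: it assumes 4-byte GPR pushes and a stack pointer that is already aligned on function entry, and `padding_before_dregs` is a hypothetical name, not a function in the backend.

```python
GPR_BYTES = 4  # assumption: each pushed core register occupies 4 bytes


def padding_before_dregs(num_gprs_pushed: int, align: int = 8) -> int:
    """Extra bytes of stack adjustment needed after pushing the GPRs so that
    the following D-register saves start on an `align`-byte boundary.

    Assumes the stack pointer was `align`-byte aligned before the push.
    """
    pushed = num_gprs_pushed * GPR_BYTES
    return (-pushed) % align


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, "GPRs pushed ->", padding_before_dregs(n), "bytes of padding")
```

With an even register count the padding is zero, which is why pushing one extra callee-saved register is the preferred option; the explicit adjustment is the fallback when no spare register can be added.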
* [X86][SSE] Enable commutation for SSE immediate blend instructionsSimon Pilgrim2014-11-042-28/+77
| | | | | | | | | | Patch to allow (v)blendps, (v)blendpd, (v)pblendw and vpblendd instructions to be commuted - swaps the src registers and inverts the blend mask. This is primarily to improve memory folding (see new tests), but it also improves the quality of shuffles (see modified tests). Differential Revision: http://reviews.llvm.org/D6015 llvm-svn: 221313
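The commutation rule described above (swap the sources, invert the mask) can be checked exhaustively on a small model. The Python sketch below assumes the usual immediate-blend convention in which a set bit i selects lane i from the second source; `blend` and `commute_blend` are hypothetical helpers, not LLVM functions.

```python
def blend(a, b, imm):
    """Immediate blend: lane i comes from b if bit i of imm is set, else from a."""
    return [b[i] if (imm >> i) & 1 else a[i] for i in range(len(a))]


def commute_blend(a, b, imm):
    """Return operands for an equivalent blend with the sources swapped."""
    lane_bits = (1 << len(a)) - 1  # only the low len(a) bits are meaningful
    return b, a, (~imm) & lane_bits


if __name__ == "__main__":
    a = [10, 11, 12, 13]
    b = [20, 21, 22, 23]
    # Every 4-lane mask gives the same result before and after commutation.
    assert all(blend(a, b, imm) == blend(*commute_blend(a, b, imm)) for imm in range(16))
    print(blend(a, b, 0b0101), commute_blend(a, b, 0b0101)[2])
```

Being free to swap the operands matters for memory folding because only one of the two sources can be a memory operand.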
* [AArch64] Use the correct register class for ORR.Juergen Ributzka2014-11-041-1/+1
| | | | | | | | | While fixing up the register classes in the machine combiner in a previous commit I missed one. This fixes the last one and adds a test case. llvm-svn: 221308
* Revert "[mips] Add names and tests for the hardware registers"Rafael Espindola2014-11-042-39/+2
| | | | | | | | | | | | | This reverts commit r221299. The tests LLVM :: MC/Disassembler/Mips/mips32.txt LLVM :: MC/Disassembler/Mips/mips32_le.txt were failing. llvm-svn: 221307
* [mips] Move COP2 & COP3 load/store instructions from MipsInstrFPU.td to ↵Vasileios Kalintiris2014-11-042-56/+54
| | | | | | | | | | | | | | MipsInstrInfo.td. NFC. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5843 llvm-svn: 221300
* [mips] Add names and tests for the hardware registersVasileios Kalintiris2014-11-042-2/+39
| | | | | | | | | | | | Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5763 llvm-svn: 221299
* [X86] Add 'FeatureSlowSHLD' to cpu 'bdver3'. Also explicit set FeatureAVX ↵Andrea Di Biagio2014-11-041-8/+11
| | | | | | | | | | | | | | | | | and FeatureSSE4A for all the bdver* cpus. This patch adds 'FeatureSlowSHLD' to 'bdver3'. According to the official AMD optimization guide for amdfam15: "Using alternative code in place of SHLD achieves lower overall latency and requires fewer execution resources. The 32-bit and 64-bit forms of ADD, ADC, SHR, and LEA (except 16-bit form) are DirectPath instructions, while SHLD is a VectorPath instruction." This patch also explicitly sets feature AVX and SSE4A for all the bdver* cpus. This part of the patch is a non-functional change and it is mainly done for clarity reasons (Both XOP and FMA4 already imply AVX and SSE4A). llvm-svn: 221296
* R600/SI: Rename div_scale dest operands to match documentationMatt Arsenault2014-11-041-2/+2
| | | | llvm-svn: 221291
* AArch64: Pattern match integer vector abs like we do on ARM.Benjamin Kramer2014-11-041-0/+22
| | | | | | This kind of pattern is emitted by the loop vectorizer. llvm-svn: 221289
* [mips] Improve support for the .set mips16/nomips16 assembler directives.Toma Tabacu2014-11-041-6/+22
| | | | | | | | | | | | | | | | | | | Summary: Appropriately set/clear the FeatureBit for Mips16 when these assembler directives are used and also emit ".set nomips16" (previously, only ".set mips16" was being emitted). These improvements allow for better testing of the .cpload/.cprestore assembler directives (which are not supposed to work when Mips16 is enabled). Test Plan: The test is bare-bones because there are no MC tests for Mips16 instructions (there's only one, which checks that the Mips16 ELF header flag gets set), and that suggests to me that it has not been implemented yet in the IAS. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5462 llvm-svn: 221277
* R600/LLVMBuild.txt: Add TransformUtils.NAKAMURA Takumi2014-11-041-1/+1
| | | | llvm-svn: 221228
* [Hexagon] Reverting 220584 to address ASAN errors.Colin LeMahieu2014-11-0419-241/+233
| | | | llvm-svn: 221210
* Rename variables to conform to llvm coding standards.Akira Hatanaka2014-11-031-28/+28
| | | | | | Differential Revision: http://reviews.llvm.org/D6062 llvm-svn: 221204
* [AArch64] Make function processLogicalImmediate more efficient. NFC.Akira Hatanaka2014-11-031-47/+42
| | | | llvm-svn: 221199
* [X86] Add debug print name for X86ISD::[US]MUL8. NFC-ish.Ahmed Bougacha2014-11-031-0/+2
| | | | | | The opcodes were added in r220516, but I forgot to add the print names. llvm-svn: 221185
* [ARM, inline-asm] Fix ARMTargetLowering::getRegForInlineAsmConstraint to returnAkira Hatanaka2014-11-031-0/+2
| | | | | | | | | | | | register class tGPRRegClass if the target is thumb1. This commit fixes a crash that occurs during register allocation which was triggered when a virtual register defined by an inline-asm instruction had to be spilled. rdar://problem/18740489 llvm-svn: 221178
* [X86] 8bit divrem: Improve codegen for AH register extraction.Ahmed Bougacha2014-11-034-28/+87
| For 8-bit divrems where the remainder is used, we used to generate:
|
|   divb %sil
|   shrw $8, %ax
|   movzbl %al, %eax
|
| That was to avoid an H-reg access, which is problematic mainly because it
| isn't possible in REX-prefixed instructions.
|
| This patch optimizes that to:
|
|   divb %sil
|   movzbl %ah, %eax
|
| To do that, we explicitly extend AH, and extract the L-subreg in the
| resulting register. The extension is done using the NOREX variants of
| MOVZX. To support signed operations, MOVSX_NOREX is also added.
| Further, this introduces a new SDNode type, [us]divrem_ext_hreg, which is
| then lowered to a sequence containing a single zext (rather than 2).
|
| Differential Revision: http://reviews.llvm.org/D6064
|
| llvm-svn: 221176
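To see that the two instruction sequences above compute the same value, the following Python sketch models AX as a 16-bit integer after an unsigned 8-bit divide (quotient in AL, remainder in AH). The function names are hypothetical and the model ignores flags and the signed (idivb) case.

```python
def divb(ax: int, divisor: int) -> int:
    """Model of unsigned `divb`: returns the new AX with AL=quotient, AH=remainder."""
    quotient, remainder = divmod(ax & 0xFFFF, divisor)
    if quotient > 0xFF:
        raise OverflowError("quotient does not fit in AL")
    return (remainder << 8) | quotient


def remainder_old(ax: int) -> int:
    """shrw $8, %ax ; movzbl %al, %eax"""
    ax = (ax >> 8) & 0xFFFF   # shift AH down into AL
    return ax & 0xFF          # zero-extend AL


def remainder_new(ax: int) -> int:
    """movzbl %ah, %eax"""
    return (ax >> 8) & 0xFF   # zero-extend AH directly


if __name__ == "__main__":
    ax = divb(200, 7)
    print(hex(ax), remainder_old(ax), remainder_new(ax))
```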
* Reapply: R600: Make sure to inline all internal functionsTom Stellard2014-11-034-0/+82
| | | | | | | | Function calls aren't supported yet. This was reverted due to build breakages, which should be fixed now. llvm-svn: 221173
* IR: MDNode => Value: Instruction::getAllMetadataOtherThanDebugLoc()Duncan P. N. Exon Smith2014-11-031-1/+2
| | | | | | | Change `Instruction::getAllMetadataOtherThanDebugLoc()` from a vector of `MDNode` to one of `Value`. Part of PR21433. llvm-svn: 221167
* Remove the cortex-a9-mp CPU.Charlie Turner2014-11-031-5/+1
| | | | | | | | | | | | | | | | | | This CPU definition is redundant. The Cortex-A9 is defined as supporting multiprocessing extensions. Remove its definition and update appropriate tests. LLVM defines both a cortex-a9 CPU and a cortex-a9-mp CPU. The only difference between the two CPU definitions in ARM.td is that cortex-a9-mp contains the feature FeatureMP for multiprocessing extensions. This is redundant since the Cortex-A9 is defined as having multiprocessing extensions in the TRMs. armcc also defines the Cortex-A9 as having multiprocessing extensions by default. Change-Id: Ifcadaa6c322be0a33d9d2a39cfdd7da1d75981a7 llvm-svn: 221166
* [AArch64] Fix miscompile of comparison with 0xffffffffffffffffOliver Stannard2014-11-031-4/+4
| | | | | | | Some literals in the AArch64 backend had 15 'f's rather than 16, causing comparisons with a constant 0xffffffffffffffff to be miscompiled. llvm-svn: 221157
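The size of the error is easy to miss by eye, so here is a small Python sketch of the difference between the two literals. The names `ALL_ONES_64` and `is_all_ones_64` are hypothetical and are not taken from the AArch64 backend.

```python
ALL_ONES_64 = (1 << 64) - 1  # 0xffffffffffffffff, sixteen 'f' digits


def is_all_ones_64(value: int) -> bool:
    """True only for the 64-bit all-ones constant."""
    return value == ALL_ONES_64


fifteen_fs = int("f" * 15, 16)  # the mistyped literal: only 60 bits set
sixteen_fs = int("f" * 16, 16)  # the intended literal: all 64 bits set

if __name__ == "__main__":
    print(fifteen_fs.bit_length(), is_all_ones_64(fifteen_fs))
    print(sixteen_fs.bit_length(), is_all_ones_64(sixteen_fs))
```

Each hexadecimal digit covers four bits, so a literal with fifteen digits leaves the top four bits of a 64-bit value clear.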
* Handle ctor/init_array initialization.Sid Manning2014-11-031-1/+1
| | | | | | | | | Hexagon was not calling InitializeELF and could not select between ctors and init_array. Phabricator revision: http://reviews.llvm.org/D6061 llvm-svn: 221156
* [mips] Remove unused prototype and variable. NFC.Daniel Sanders2014-11-032-5/+0
| | | | llvm-svn: 221146
* R600: Don't unnecessarily repeat the register classMatt Arsenault2014-11-021-5/+5
| | | | llvm-svn: 221119
* R600/SI: Use REG_SEQUENCE instead of INSERT_SUBREGsMatt Arsenault2014-11-024-47/+43
| | | | llvm-svn: 221118
* Support REG_SEQUENCE in tablegen.Matt Arsenault2014-11-021-3/+3
| | | | | | | The problem is mostly that variadic output instructions aren't handled, so REG_SEQUENCE is rejected for having an inconsistent number of operands, and then the right number of operands isn't emitted. llvm-svn: 221117
* Re-commit r221056 and others with fix, "[mips] Move F128 argument handling ↵Daniel Sanders2014-11-0210-175/+255
| | | | | | | | | into MipsCCState as we did for returns. NFC." sret arguments can never originate from an f128 argument so we detect sret arguments and push false into OriginalArgWasF128. llvm-svn: 221102
* Revert r221056 and others, "[mips] Move F128 argument handling into ↵NAKAMURA Takumi2014-11-0210-246/+175
| MipsCCState as we did for returns. NFC."
|
|   r221056 "[mips] Move F128 argument handling into MipsCCState as we did for returns. NFC."
|   r221058 "[mips] Fix unused variable warning introduced in r221056"
|   r221059 "[mips] Move all ByVal handling into CCState and tablegen-erated code. NFC."
|   r221061 "Renamed CCState members that appear to misspell 'Processed' as 'Proceed'. NFC."
|
| It caused undefined behavior in LLVM :: CodeGen/Mips/return-vector.ll.
|
| llvm-svn: 221081
* Renamed CCState members that appear to misspell 'Processed' as 'Proceed'. NFC.Daniel Sanders2014-11-012-5/+5
| | | | | | | | | | | | Reviewers: rnk Reviewed By: rnk Subscribers: rnk, llvm-commits Differential Revision: http://reviews.llvm.org/D5978 llvm-svn: 221061
* [mips] Move all ByVal handling into CCState and tablegen-erated code. NFC.Daniel Sanders2014-11-0110-148/+157
| | | | | | | | | | | | | | | | | Summary: CCState already contains a byval implementation that is very similar to the Mips custom code. This patch merges the custom code into the existing common code and tablegen-erated code. Reviewers: vmedic Reviewed By: vmedic Subscribers: rnk, llvm-commits Differential Revision: http://reviews.llvm.org/D5977 llvm-svn: 221059