path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* R600/SI: Remove SI_ADDR64_RSRC | Matt Arsenault | 2014-11-05 | 4 | -44/+62
  llvm-svn: 221382
* [NVPTX] Add NVPTXLowerStructArgs pass | Justin Holewinski | 2014-11-05 | 4 | -0/+154
  This works around the limitation that PTX does not allow .param space
  loads/stores with arbitrary pointers.

  If a function has a by-val struct ptr arg, say foo(%struct.x *byval %d),
  then add the following instructions to the first basic block:

    %temp = alloca %struct.x, align 8
    %tt1 = bitcast %struct.x * %d to i8 *
    %tt2 = llvm.nvvm.cvt.gen.to.param %tt2
    %tempd = bitcast i8 addrspace(101) * to %struct.x addrspace(101) *
    %tv = load %struct.x addrspace(101) * %tempd
    store %struct.x %tv, %struct.x * %temp, align 8

  The above code allocates some space in the stack and copies the incoming
  struct from param space to local space. Then replace all occurrences of
  %d by %temp.

  Fixes PR21465.
  llvm-svn: 221377
* IR: MDNode => Value: NamedMDNode::getOperator() | Duncan P. N. Exon Smith | 2014-11-05 | 15 | -35/+37
  Change `NamedMDNode::getOperator()` from returning `MDNode *` to
  returning `Value *`. To reduce boilerplate at some call sites, add a
  `getOperatorAsMDNode()` for named metadata that's expected to only
  return `MDNode` -- for now, that's everything, but debug node named
  metadata (such as llvm.dbg.cu and llvm.dbg.sp) will soon change.

  This is part of PR21433.

  Note that there's a follow-up patch to clang for the API change.
  llvm-svn: 221375
* remove extra breaks; NFC | Sanjay Patel | 2014-11-05 | 1 | -4/+1
  llvm-svn: 221374
* IR: MDNode => Value: AsmWriter SlotTracker API | Duncan P. N. Exon Smith | 2014-11-05 | 1 | -8/+9
  Change `SlotTracker::CreateMetadataSlot()` and
  `SlotTracker::getMetadataSlot()` to use `Value` instead of `MDNode`.

  Part of PR21433.
  llvm-svn: 221373
* [ARM] Remove more dead code. | Tilmann Scheller | 2014-11-05 | 1 | -4/+0
  Dead code identified by the Clang static analyzer.
  llvm-svn: 221372
* [mips][microMIPS] Implement CodeGen support for ANDI16 instruction | Zoran Jovanovic | 2014-11-05 | 2 | -2/+13
  llvm-svn: 221371
* [Hexagon] [NFC] Alphabetizing cmake files. | Colin LeMahieu | 2014-11-05 | 1 | -6/+6
  llvm-svn: 221370
* [mips][microMIPS] Implement CodeGen support for SLL16 and SRL16 instructions | Zoran Jovanovic | 2014-11-05 | 2 | -9/+22
  llvm-svn: 221369
* [ARM] Remove another redundant assignment. | Tilmann Scheller | 2014-11-05 | 1 | -1/+0
  Found by the Clang static analyzer.
  llvm-svn: 221368
* [mips][microMIPS] Implement ANDI16 instruction | Zoran Jovanovic | 2014-11-05 | 5 | -0/+64
  llvm-svn: 221367
* [ARM] Remove redundant assignment. | Tilmann Scheller | 2014-11-05 | 1 | -1/+0
  Found by the Clang static analyzer.
  llvm-svn: 221366
* [dfsan] Abort at runtime on indirect calls to uninstrumented vararg functions. | Peter Collingbourne | 2014-11-05 | 1 | -10/+33
  We currently have no infrastructure to support these correctly.

  This is accomplished by generating a call to a runtime library function
  that aborts at runtime in place of the regular wrapper for such
  functions. Direct calls are rewritten in the usual way during traversal
  of the caller's IR.

  We also remove the "split-stack" attribute from such wrappers, as the
  code generator cannot currently handle split-stack vararg functions.
  llvm-svn: 221360
* IR: MDNode => Value: NamedMDNode::addOperand() | Duncan P. N. Exon Smith | 2014-11-05 | 1 | -1/+2
  Change `NamedMDNode::addOperand()` to take a `Value *` instead of an
  `MDNode *`.

  This is part of PR21433.
  llvm-svn: 221359
* [ARM] Remove dead code identified by the Clang static analyzer. | Tilmann Scheller | 2014-11-05 | 1 | -2/+0
  llvm-svn: 221358
* [mips][microMIPS] Mark symbols as microMIPS if necessary | Zoran Jovanovic | 2014-11-05 | 2 | -0/+52
  Differential Revision: http://reviews.llvm.org/D6039
  llvm-svn: 221355
* Reverted revisions 221351, 221352 and 221353. | Zoran Jovanovic | 2014-11-05 | 6 | -99/+11
  llvm-svn: 221354
* [mips][microMIPS] Implement CodeGen support for ANDI16 instruction | Zoran Jovanovic | 2014-11-05 | 2 | -2/+13
  Differential Revision: http://reviews.llvm.org/D5797
  llvm-svn: 221353
* [mips][microMIPS] Implement CodeGen support for SLL16 and SRL16 instructions | Zoran Jovanovic | 2014-11-05 | 2 | -9/+22
  Differential Revision: http://reviews.llvm.org/D5933
  llvm-svn: 221352
* [mips][microMIPS] Implement ANDI16 instruction | Zoran Jovanovic | 2014-11-05 | 5 | -0/+64
  Differential Revision: http://reviews.llvm.org/D5163
  llvm-svn: 221351
* R600/SI: Change all instruction assembly names to lowercase. | Tom Stellard | 2014-11-05 | 2 | -859/+859
  This matches the format produced by the AMD proprietary driver.

  //==================================================================//
  // Shell script for converting .ll test cases: (Pass the .ll files
  // you want to convert to this script as arguments).
  //==================================================================//

    # This was necessary on my system so that A-Z in sed would match only
    # upper case. I'm not sure why.
    export LC_ALL='C'

    TEST_FILES="$*"

    MATCHES=`grep -v Patterns SIInstructions.td | grep -o '"[A-Z0-9_]\+["e]' | grep -o '[A-Z0-9_]\+' | sort -r`

    for f in $TEST_FILES; do

      # Check that there are SI tests:
      grep -q -e 'verde' -e 'bonaire' -e 'SI' -e 'tahiti' $f
      if [ $? -eq 0 ]; then
        for match in $MATCHES; do
          sed -i -e "s/\([ :]$match\)/\L\1/" $f
        done

        # Try to get check lines with partial instruction names
        sed -i 's/\(;[ ]*SI[A-Z\\-]*: \)\([A-Z_0-9]\+\)/\1\L\2/' $f
      fi
    done

    sed -i -e 's/bb0_1/BB0_1/g' ../../../test/CodeGen/R600/infinite-loop.ll
    sed -i -e 's/SI-NOT: bfe/SI-NOT: {{[^@]}}bfe/g' ../../../test/CodeGen/R600/llvm.AMDGPU.bfe.*32.ll ../../../test/CodeGen/R600/sext-in-reg.ll
    sed -i -e 's/exp_IEEE/EXP_IEEE/g' ../../../test/CodeGen/R600/llvm.exp2.ll
    sed -i -e 's/numVgprs/NumVgprs/g' ../../../test/CodeGen/R600/register-count-comments.ll
    sed -i 's/\(; CHECK[-NOT]*: \)\([A-Z_0-9]\+\)/\1\L\2/' ../../../test/CodeGen/R600/select64.ll ../../../test/CodeGen/R600/sgpr-copy.ll

  //==================================================================//
  // Shell script for converting .td files (run this last)
  //==================================================================//

    export LC_ALL='C'
    sed -i -e '/Patterns/!s/\("[A-Z0-9_]\+[ "e]\)/\L\1/g' SIInstructions.td
    sed -i -e 's/"EXP/"exp/g' SIInstrInfo.td

  llvm-svn: 221350
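The core substitution in the .td conversion script relies on GNU sed's `\L` case conversion. As a rough illustration only, the same rewrite can be modelled in Python; the function name `lowercase_asm_names` and the sample input line are hypothetical and do not come from the commit.

```python
import re

# Same shape as the sed pattern: a double quote, a run of upper-case
# letters/digits/underscores, then a space, a closing quote, or a literal 'e'.
ASM_NAME = re.compile(r'("[A-Z0-9_]+[ "e])')


def lowercase_asm_names(line):
    """Lower-case quoted mnemonics, skipping lines that mention 'Patterns'."""
    if "Patterns" in line:
        # Mirrors the '/Patterns/!' address in the sed command.
        return line
    return ASM_NAME.sub(lambda m: m.group(1).lower(), line)


if __name__ == "__main__":
    # Hypothetical .td-style line, used only to exercise the pattern.
    print(lowercase_asm_names('def : Foo <"V_MOV_B32 $dst">;'))
```

Lines containing `Patterns` are returned unchanged, matching the address restriction in the sed command.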
* [X86] Teach method 'isVectorClearMaskLegal' how to check for legal blend masks. | Andrea Di Biagio | 2014-11-05 | 2 | -2/+4
  This patch improves the folding of vector AND nodes into blend
  operations for targets that feature SSE4.1. A vector AND node where one
  of the operands is a constant build_vector with elements that are either
  zero or all-ones can be converted into a blend.

  This allows for example to simplify the following code:

    define <4 x i32> @test(<4 x i32> %A, <4 x i32> %B) {
      %1 = and <4 x i32> %A, <i32 0, i32 0, i32 0, i32 -1>
      %2 = and <4 x i32> %B, <i32 -1, i32 -1, i32 -1, i32 0>
      %3 = or <4 x i32> %1, %2
      ret <4 x i32> %3
    }

  Before this patch llc (-mcpu=corei7) generated:

    andps LCPI1_0(%rip), %xmm0, %xmm0
    andps LCPI1_1(%rip), %xmm1, %xmm1
    orps %xmm1, %xmm0, %xmm0
    retq

  With this patch we generate a single 'vpblendw'.
  llvm-svn: 221343
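The equivalence behind this fold can be checked with a small lane-wise model. This is a sketch in Python rather than LLVM code: the helper names are hypothetical, lanes are modelled as unsigned 32-bit integers, and the blend form assumes the two constant masks are complements of each other, as they are in the IR example.

```python
ALL_ONES = 0xFFFFFFFF  # i32 -1 viewed as an unsigned 32-bit lane


def is_blend_mask(mask):
    """True if every lane of the constant is either zero or all-ones."""
    return all(lane in (0, ALL_ONES) for lane in mask)


def and_or(a, b, mask_a, mask_b):
    """Lane-wise (a & mask_a) | (b & mask_b), as in the IR example."""
    return [(x & ma) | (y & mb) for x, y, ma, mb in zip(a, b, mask_a, mask_b)]


def blend(a, b, mask_a):
    """Lane-wise select: take a where mask_a is all-ones, otherwise b."""
    return [x if m == ALL_ONES else y for x, y, m in zip(a, b, mask_a)]


A = [1, 2, 3, 4]
B = [10, 20, 30, 40]
MASK_A = [0, 0, 0, ALL_ONES]                # <0, 0, 0, -1>
MASK_B = [ALL_ONES, ALL_ONES, ALL_ONES, 0]  # <-1, -1, -1, 0>

if __name__ == "__main__":
    print(and_or(A, B, MASK_A, MASK_B), blend(A, B, MASK_A))
```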
* [ARM] Honor FeatureD16 in the assembler and disassembler | Oliver Stannard | 2014-11-05 | 2 | -1/+12
  Some ARM FPUs only have 16 double-precision registers, rather than the
  normal 32. LLVM represents this with the D16 target feature. This is
  currently used by CodeGen to avoid using high registers when they are
  not available, but the assembler and disassembler do not.

  I fix this in the assembler and disassembler rather than the
  InstrInfo.td files, as the latter would require a large number of
  changes everywhere one of the floating-point instructions is referenced
  in the backend. This solution is similar to the one used for
  co-processor numbers and MSR masks.
  llvm-svn: 221341
* Improve logic that decides if it's profitable to commute when some of the virtual registers involved have uses/defs chains connecting them to a physical register. | Craig Topper | 2014-11-05 | 1 | -4/+15
  Fix up the tests that this change improves.
  llvm-svn: 221336
* llvm-readobj: Add support for dumping the DOS header in PE files | David Majnemer | 2014-11-05 | 2 | -16/+17
  llvm-svn: 221333
* Revert 220932. | Jiangning Liu | 2014-11-05 | 5 | -49/+4
  Commit 220932 caused a crash when building clang-tblgen on an aarch64
  debian target, so it's blocking all daily tests.

  The std::call_once implementation in pthread has a bug on aarch64
  debian.
  llvm-svn: 221331
* IR: Metadata: Remove unnecessary dyn_cast | Duncan P. N. Exon Smith | 2014-11-05 | 1 | -1/+1
  llvm-svn: 221328
* InstSimplify: Exact shifts of X by Y are X if X has the lsb set | David Majnemer | 2014-11-05 | 1 | -11/+31
  Exact shifts may not shift out any non-zero bits. Use computeKnownBits
  to determine when this occurs and just return the left hand side.

  This fixes PR21477.
  llvm-svn: 221325
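The reasoning can be spelled out with a toy model of an exact logical right shift. This is an illustrative Python sketch, not the InstSimplify code: `lshr_exact` and `defined_results` are hypothetical names, poison is modelled as `None`, and an 8-bit width is assumed to keep the search small.

```python
def lshr_exact(x, y, bits=8):
    """Model 'lshr exact': None (poison) if any set bit would be shifted out."""
    if y >= bits:
        return None  # over-wide shift amounts are not defined either
    if x & ((1 << y) - 1):
        return None  # a non-zero bit is shifted out, so the result is poison
    return x >> y


def defined_results(x, bits=8):
    """All non-poison results of shifting x by every in-range amount."""
    return {lshr_exact(x, y, bits) for y in range(bits)} - {None}


if __name__ == "__main__":
    print(defined_results(5))  # lsb set: only a shift by zero is defined
    print(defined_results(4))  # lsb clear: several shifts are defined
```

When the low bit of x is known to be set, the only shift amount that avoids poison is zero, so replacing the shift with x is sound in this model.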
* ARM: try to add extra CS-register whenever stack alignment >= 8. | Tim Northover | 2014-11-05 | 1 | -1/+1
  We currently try to push an even number of registers to preserve 8-byte
  alignment during a function's prologue, but only when the stack
  alignment is precisely 8. Many of the reasons for doing it apply also
  when that alignment > 8 (the extra store is often free, and can save
  another stack adjustment, though less frequently for 16-byte stack
  alignment).
  llvm-svn: 221321
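The changed condition can be pictured with a deliberately simplified model. This Python sketch is an assumption-laden illustration, not the ARM frame lowering code: `wants_extra_cs_register` is a hypothetical name, and registers are assumed to be 4 bytes wide.

```python
def wants_extra_cs_register(num_pushed, stack_alignment, reg_size=4):
    """True if pushing one more callee-saved register would restore 8-byte alignment.

    Models the new condition (alignment >= 8) rather than the old one
    (alignment == 8).
    """
    misaligned = (num_pushed * reg_size) % 8 != 0
    return stack_alignment >= 8 and misaligned


if __name__ == "__main__":
    print(wants_extra_cs_register(3, 8))   # odd count, 8-byte alignment
    print(wants_extra_cs_register(3, 16))  # odd count, 16-byte alignment
    print(wants_extra_cs_register(4, 16))  # even count, nothing to do
```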
* ARM/Dwarf: correctly align stack before callee-saved VPRs | Tim Northover | 2014-11-05 | 2 | -5/+26
  We were making an attempt to do this by adding an extra callee-saved GPR
  (so that there was an even number in the list), but when that failed we
  went ahead and pushed anyway.

  This had a couple of potential issues:
    + The .cfi directives we emit misplaced dN because they were based on
      PrologEpilogInserter's calculation.
    + Unaligned stores can be less efficient.
    + Unaligned stores can actually fault (likely only an issue in niche
      cases, but possible).

  This adds a final explicit stack adjustment if all other options fail,
  so that the actual locations of the registers match up with where they
  should be.
  llvm-svn: 221320
* Analysis: Make isSafeToSpeculativelyExecute fire less for divides | David Majnemer | 2014-11-04 | 1 | -15/+23
  Divides and remainder operations do not behave like other operations
  when they are given poison: they turn into undefined behavior.

  It's really hard to know if the operands going into a div are or are not
  poison. Because of this, we should only choose to speculate if there are
  constant operands which we can easily reason about.

  This fixes PR21412.
  llvm-svn: 221318
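A check of the kind described here might look like the following for signed division. This Python sketch is a guess at the shape of the rule, not the patch itself: the function name is hypothetical, `None` stands for a non-constant operand, and a 32-bit width is assumed.

```python
INT_MIN = -(1 << 31)  # assumed 32-bit signed minimum


def safe_to_speculate_sdiv(numerator, denominator):
    """Decide whether a signed divide may be hoisted, using constants only.

    Each operand is an int when it is a known constant, or None otherwise.
    """
    if denominator is None or denominator == 0:
        return False  # unknown or zero divisor: may be undefined behavior
    if denominator != -1:
        return True   # a non-zero divisor other than -1 cannot overflow
    # Divisor is -1: only safe if the numerator is a constant other than INT_MIN.
    return numerator is not None and numerator != INT_MIN


if __name__ == "__main__":
    print(safe_to_speculate_sdiv(None, 7))
    print(safe_to_speculate_sdiv(None, -1))
```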
* Revert "[Reassociate] Canonicalize negative constants out of expressions." | Reid Kleckner | 2014-11-04 | 1 | -101/+42
  This reverts commit r221171. It performs this invalid transformation:

    - %div.i = urem i64 -1, %add
    - %sub.i = sub i64 -2, %div.i
    + %div.i = urem i64 1, %add
    + %sub.i1 = add i64 %div.i, -2

  llvm-svn: 221317
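A concrete counterexample shows why the two forms differ. The sketch below evaluates both sequences with 64-bit wrap-around arithmetic in Python; the helper names are hypothetical and the choice of 3 for %add is arbitrary.

```python
MASK = (1 << 64) - 1  # i64 values modelled as unsigned 64-bit integers


def urem(a, b):
    """Unsigned remainder on 64-bit values (b must be non-zero)."""
    return (a & MASK) % (b & MASK)


def before(add):
    """%div.i = urem i64 -1, %add ; %sub.i = sub i64 -2, %div.i"""
    div = urem(-1, add)
    return (-2 - div) & MASK


def after(add):
    """%div.i = urem i64 1, %add ; %sub.i1 = add i64 %div.i, -2"""
    div = urem(1, add)
    return (div - 2) & MASK


if __name__ == "__main__":
    print(hex(before(3)), hex(after(3)))
```

Negating the dividend of an unsigned remainder does not simply negate the result, which is why the rewritten sequence computes a different value.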
* [X86][SSE] Enable commutation for SSE immediate blend instructions | Simon Pilgrim | 2014-11-04 | 2 | -28/+77
  Patch to allow (v)blendps, (v)blendpd, (v)pblendw and vpblendd
  instructions to be commuted - swaps the src registers and inverts the
  blend mask.

  This is primarily to improve memory folding (see new tests), but it also
  improves the quality of shuffles (see modified tests).

  Differential Revision: http://reviews.llvm.org/D6015
  llvm-svn: 221313
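The commutation rule (swap the sources, invert the mask) can be verified exhaustively for a four-lane blend. This Python sketch models only the lane selection; `blend` and `commute` are hypothetical helper names, and a set immediate bit is taken to select the second source, as in blendps.

```python
def blend(src1, src2, imm):
    """Immediate blend: lane i comes from src2 if bit i of imm is set, else src1."""
    return [b if (imm >> i) & 1 else a
            for i, (a, b) in enumerate(zip(src1, src2))]


def commute(src1, src2, imm):
    """Swap the sources and invert the mask within the number of lanes."""
    lanes = len(src1)
    return src2, src1, ~imm & ((1 << lanes) - 1)


if __name__ == "__main__":
    a, b = [1, 2, 3, 4], [10, 20, 30, 40]
    for imm in range(16):
        assert blend(a, b, imm) == blend(*commute(a, b, imm))
    print(blend(a, b, 0b0101))
```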
* Revert earlier change removing setPreservesCFG from instcombine (r221223) and change LoopSimplifyPass to be !isCFGOnly. | Mark Heffernan | 2014-11-04 | 2 | -4/+3
  The motivation for the earlier patch (r221223) was that LoopSimplify is
  not preserved by instcombine though setPreservesCFG indicates that it
  is. This change fixes the issue by making setPreservesCFG no longer
  imply LoopSimplifyPass, and is therefore less invasive.
  llvm-svn: 221311
* [AArch64] Use the correct register class for ORR. | Juergen Ributzka | 2014-11-04 | 1 | -1/+1
  While fixing up the register classes in the machine combiner in a
  previous commit I missed one.

  This fixes the last one and adds a test case.
  llvm-svn: 221308
* Revert "[mips] Add names and tests for the hardware registers" | Rafael Espindola | 2014-11-04 | 2 | -39/+2
  This reverts commit r221299.

  The tests
    LLVM :: MC/Disassembler/Mips/mips32.txt
    LLVM :: MC/Disassembler/Mips/mips32_le.txt
  were failing.
  llvm-svn: 221307
* Provide gmlt-like inline scope information in the skeleton CU to facilitate symbolication without needing the .dwo files | David Blaikie | 2014-11-04 | 5 | -31/+74
  Clang -gsplit-dwarf self-host -O0, binary increases by 0.0005%, -O2,
  binary increases by 25%.

  A large binary inside Google, split-dwarf, -O0, and other internal flags
  (GDB index, etc) increases by 1.8%, optimized build is 35%.

  The size impact may be somewhat greater in .o files (I haven't measured
  that much - since the linked executable -O0 numbers seemed low enough)
  due to relocations. These relocations could be removed if we taught the
  llvm-symbolizer to handle indexed addressing in the .o file (GDB can't
  cope with this just yet, but GDB won't be reading this info anyway).

  Also debug_ranges could be shared between .o and .dwo, though ideally
  debug_ranges would get a schema that could use index(+offset)
  addressing, and move to the .dwo file, then we'd be back to sharing
  addresses in the address pool again.

  But for now, these sizes seem small enough to go ahead with this.

  Verified that no other DW_TAGs are produced into the .o file other than
  subprograms and inlined_subroutines.
  llvm-svn: 221306
* Move cross-unit DIE caching to the DwarfFile level, so it doesn't interfere with fission-gmlt data and produce skeleton<>full unit cross referencing. | David Blaikie | 2014-11-04 | 3 | -14/+14
  llvm-svn: 221305
* Don't produce relocations for a difference in a section with no symbols. | Rafael Espindola | 2014-11-04 | 1 | -5/+4
  We were producing a relocation for

    ----------------
    .section foo,bar
    La:
    Lb:
    .long La-Lb
    ----------------

  but not for

    ----------------
    .section foo,bar
    zed:
    La:
    Lb:
    .long La-Lb
    ----------------

  This patch handles the case where both fragments are part of the first
  atom in a section and there is no corresponding symbol to that atom.

  This fixes pr21328.
  llvm-svn: 221304
* [mips] Move COP2 & COP3 load/store instructions from MipsInstrFPU.td to MipsInstrInfo.td. NFC. | Vasileios Kalintiris | 2014-11-04 | 2 | -56/+54
  Reviewers: dsanders
  Reviewed By: dsanders
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D5843
  llvm-svn: 221300
* [mips] Add names and tests for the hardware registers | Vasileios Kalintiris | 2014-11-04 | 2 | -2/+39
  Reviewers: dsanders
  Reviewed By: dsanders
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D5763
  llvm-svn: 221299
* [X86] Add 'FeatureSlowSHLD' to cpu 'bdver3'. Also explicitly set FeatureAVX and FeatureSSE4A for all the bdver* cpus. | Andrea Di Biagio | 2014-11-04 | 1 | -8/+11
  This patch adds 'FeatureSlowSHLD' to 'bdver3'. According to the official
  AMD optimization guide for amdfam15: "Using alternative code in place of
  SHLD achieves lower overall latency and requires fewer execution
  resources. The 32-bit and 64-bit forms of ADD, ADC, SHR, and LEA (except
  16-bit form) are DirectPath instructions, while SHLD is a VectorPath
  instruction."

  This patch also explicitly sets feature AVX and SSE4A for all the bdver*
  cpus. This part of the patch is a non-functional change and it is mainly
  done for clarity (both XOP and FMA4 already imply AVX and SSE4A).
  llvm-svn: 221296
* [PBQP] Callee saved regs should have a higher cost than scratch regs | Arnaud A. de Grandmaison | 2014-11-04 | 1 | -0/+16
  Registers are not all equal. Some are not allocatable (infinite cost),
  some have to be preserved but can be used, and some others are just free
  to use.

  Ensure there is a cost hierarchy reflecting this fact, so that the
  allocator will favor scratch registers over callee-saved registers.
  llvm-svn: 221293
* [PBQP] Tweak spill costs and coalescing benefits | Arnaud A. de Grandmaison | 2014-11-04 | 3 | -10/+25
  This patch improves how the different costs (register, interference,
  spill and coalescing) relate to each other. The assumption is now that:
    - coalescing (or any other "side effect" of reg alloc) is negative,
      and instead of being derived from a spill cost, they use the block
      frequency info.
    - spill costs are in the [MinSpillCost, +inf) range
    - register or interference costs are in [0.0, MinSpillCost) or +inf

  The current MinSpillCost is set to 10.0, which is an arbitrary value
  high enough that the current constraint builders do not need to worry
  about it when setting costs. It would however be worth adding a
  normalization step for register and interference costs as the last step
  in the constraint builder chain to ensure they are not greater than
  MinSpillCost (unless this makes sense for some architectures). This
  would work well with the current builder pipeline, where all costs are
  tweaked relative to each other, but could grow above MinSpillCost if
  the pipeline is deep enough.

  The current heuristic is tuned to depend on the number of uses of a live
  interval rather than on a density of uses, as used by the greedy
  allocator. This heuristic provides a few percent improvement on a number
  of benchmarks (eembc, spec, ...) and will definitely need to change once
  spill placement is implemented: the current spill placement is really
  inefficient, so making the cost proportional to the number of uses is a
  clear win.
  llvm-svn: 221292
* R600/SI: Rename div_scale dest operands to match documentation | Matt Arsenault | 2014-11-04 | 1 | -2/+2
  llvm-svn: 221291
* AArch64: Pattern match integer vector abs like we do on ARM. | Benjamin Kramer | 2014-11-04 | 1 | -0/+22
  This kind of pattern is emitted by the loop vectorizer.
  llvm-svn: 221289
* [asan] [mips] changed ShadowOffset32 for systems having 16kb PageSize; patch by Kumar Sukhani | Kostya Serebryany | 2014-11-04 | 1 | -1/+1
  llvm-svn: 221288
* InstSimplify: Fold a hasNoSignedWrap() call into a match() expression | David Majnemer | 2014-11-04 | 1 | -2/+1
  No functionality change intended, it's just a little more concise.
  llvm-svn: 221281
* InstSimplify: Fold a hasNoUnsignedWrap() call into a match() expression | David Majnemer | 2014-11-04 | 1 | -2/+1
  No functionality change intended, it's just a little more concise.
  llvm-svn: 221280
* [mips] Improve support for the .set mips16/nomips16 assembler directives. | Toma Tabacu | 2014-11-04 | 1 | -6/+22
  Summary:
  Appropriately set/clear the FeatureBit for Mips16 when these assembler
  directives are used and also emit ".set nomips16" (previously, only
  ".set mips16" was being emitted).

  These improvements allow for better testing of the .cpload/.cprestore
  assembler directives (which are not supposed to work when Mips16 is
  enabled).

  Test Plan:
  The test is bare-bones because there are no MC tests for Mips16
  instructions (there's only one, which checks that the Mips16 ELF header
  flag gets set), and that suggests to me that it has not been implemented
  yet in the IAS.

  Reviewers: dsanders
  Reviewed By: dsanders
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D5462
  llvm-svn: 221277