summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* DebugInfo: Support debug_loc under fissionDavid Blaikie2014-03-257-15/+130
| | | | | | | | | | | | | | | | | | | | | | Implement debug_loc.dwo, as well as llvm-dwarfdump support for dumping this section. Outlined in the DWARF5 spec and http://gcc.gnu.org/wiki/DebugFission the debug_loc.dwo section has more variation than the standard debug_loc, allowing 3 different forms of entry (plus the end of list entry). GCC seems to, and Clang certainly, only use one form, so I've just implemented dumping support for that for now. It wasn't immediately obvious that there was a good refactoring to share the implementation of dumping support between debug_loc and debug_loc.dwo, so they're separate for now - ideas welcome or I may come back to it at some point. As per a comment in the code, we could choose different forms that may reduce the number of debug_addr entries we emit, but that will require further study. llvm-svn: 204697
* DebugInfo: Remove unnecessary zero-size checkDavid Blaikie2014-03-251-3/+0
| | | | | | | | | This seems excessive - switching section isn't expensive (or if it is we're already being wasteful, since we emitted the debug_loc section symbol earlier anyway) and otherwise there's no work that happens in this function when the list is empty. llvm-svn: 204696
* Register Allocator: check other options before using a CSR for the first time.Manman Ren2014-03-251-6/+57
| | | | | | | | | | | | | | | | | | | When register allocator's stage is RS_Spill, we choose spill over using the CSR for the first time, if the spill cost is lower than CSRCost. When register allocator's stage is < RS_Split, we choose pre-splitting over using the CSR for the first time, if the cost of splitting is lower than CSRCost. CSRCost is set with command-line option "regalloc-csr-first-time-cost". The default value is 0 to generate the same codes as before this commit. With a value of 15 (1 << 14 is the entry frequency), I measured performance gain of 3% on 253.perlbmk and 1.7% on 197.parser, with instrumented PGO, on an arm device. rdar://16162005 llvm-svn: 204690
* Fix crashes when assembler directives are used that are notKevin Enderby2014-03-251-1/+64
| | | | | | | | for Mach-O object files by generating an error instead. rdar://16335232 llvm-svn: 204687
* Register Allocator: refactoring (no functionality change).Manman Ren2014-03-241-6/+30
| | | | | | | | | Factor out two functions calculateRegionSplitCost and doRegionSplit from tryRegionSplit. These two functions will be used in coming patches. rdar://16162005 llvm-svn: 204684
* DebugInfo: Simplify debug loc list handling by keeping separate listsDavid Blaikie2014-03-243-33/+17
| | | | | | | Rather than using a flat list with "empty" entries (ala the actual on-disk format), keep separate lists for each variable. llvm-svn: 204680
* DwarfDebug: Simplify debug_loc mergingDavid Blaikie2014-03-243-29/+13
| | | | | | | | | | No functional change intended. Merging up-front rather than delaying this task until later. This just seems simpler and more efficient (avoiding growing the debug loc list only to have to skip over those post-merged entries, etc). llvm-svn: 204679
* Get rid of an unnecessary use of the * and & operators.Adrian Prantl2014-03-241-1/+1
| | | | llvm-svn: 204673
* DebugInfo: Add DW_AT_GNU_ranges_base to skeleton CUsDavid Blaikie2014-03-241-4/+9
| | | | | | | | | This is used to avoid relocations in the dwo file by allowing DW_AT_ranges specified in debug_info.dwo to be relative to this base address. (r204667 implements the base-relative DW_AT_ranges side of this) llvm-svn: 204672
* DebugInfo: Implement relative addressing for DW_AT_ranges under fissionDavid Blaikie2014-03-241-2/+9
| | | | | | | | This removes the debug_ranges relocations from debug_info.dwo (but doesn't implement the DW_AT_GNU_ranges_base which is also necessary for correct functioning) llvm-svn: 204668
* DebugInfo: Don't emit relocations to abbreviations in debug_info.dwoDavid Blaikie2014-03-242-2/+6
| | | | llvm-svn: 204667
* DwarfDebug: Remove an unused parameterDavid Blaikie2014-03-242-9/+4
| | | | llvm-svn: 204665
* R600: Don't viewCFG() under DEBUG() except on failure.Matt Arsenault2014-03-241-9/+6
| | | | | | | Having these popping up every time you use -debug is really irritating. llvm-svn: 204664
* Remove unused parameterDavid Blaikie2014-03-243-10/+6
| | | | llvm-svn: 204663
* R600/SI: Fix extra mov from legalizing 64-bit SALU ops.Matt Arsenault2014-03-241-14/+26
| | | | | | | Check the register class of each operand individually to avoid an extra copy to a vgpr. llvm-svn: 204662
* R600/SI: Sub-optimial fix for 64-bit immediates with SALU ops.Matt Arsenault2014-03-242-17/+44
| | | | | | | No longer asserts, but now you get moves loading legal immediates into the split 32-bit operations. llvm-svn: 204661
* R600/SI: Fix 64-bit bit ops that require the VALU.Matt Arsenault2014-03-243-1/+82
| | | | | | | | Try to match scalar and first like the other instructions. Expand 64-bit ands to a pair of 32-bit ands since that is not available on the VALU. llvm-svn: 204660
* In Release modes, Visual Studio complains that the Operator destructor in ↵Yaron Keren2014-03-241-0/+20
| | | | | | | | | | | | | | | | | | | | | User.cpp never returns, which is true by design. Initially assumed that the reason is llvm_unreachable being dependent on NDEBUG. However, even if llvm_unreachable is replaced by __assume(false), VC still warns in Release modes but not in Debug modes... The real reason turned out to be optimization flags. With /Od in Debug modes the warning is not issued whereas with /O1 it is. I could not find any documentation to this effect, but it is reproducable: Try compiling http://msdn.microsoft.com/en-us/library/khwfyc5d(v=vs.90).aspx with /O1 and then with /Od. llvm-svn: 204659
* R600: Implement isNarrowingProfitable.Matt Arsenault2014-03-242-0/+12
| | | | llvm-svn: 204658
* R600/SI: Move splitting 64-bit immediates to separate function.Matt Arsenault2014-03-242-39/+59
| | | | llvm-svn: 204651
* [PowerPC] Generate little-endian object filesUlrich Weigand2014-03-245-24/+55
| | | | | | | | | | | | | | | | | | | | As a first step towards real little-endian code generation, this patch changes the PowerPC MC layer to actually generate little-endian object files. This involves passing the little-endian flag through the various layers, including down to createELFObjectWriter so we actually get basic little-endian ELF objects, emitting instructions in little-endian order, and handling fixups and relocations as appropriate for little-endian. The bulk of the patch is to update most test cases in test/MC/PowerPC to verify both big- and little-endian encodings. (The only test cases *not* updated are those that create actual big-endian ABI code, like the TLS tests.) Note that while the object files are now little-endian, the generated code itself is not yet updated, in particular, it still does not adhere to the ELFv2 ABI. llvm-svn: 204634
* [X86][ISelDAG] Add missing fallback patterns for avx2 broadcast instructions.Quentin Colombet2014-03-241-0/+25
| | | | | | | | | | | | | Those patterns are used when the load cannot be folded into the related broadcast during the select phase. This happens when the load gets additional uses that were not anticipated during the previous lowering phases (constant vector to constant load, then constant load reused) or when selection DAG is not able to prove that folding the load will not create a cycle in the DAG. <rdar://problem/16074331> llvm-svn: 204631
* R600/SI: Fix 64-bit private loads.Matt Arsenault2014-03-241-1/+17
| | | | llvm-svn: 204630
* [X86] Fix non-determinism in LowerVectorAllZeroTestAdam Nemet2014-03-241-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | This can be observed with the old testcase of CodeGen/X86/pr12312.ll: 47c47 < vorps %ymm0, %ymm1, %ymm0 --- > vorps %ymm1, %ymm0, %ymm0 97c97 < vorps %ymm1, %ymm0, %ymm0 --- > vorps %ymm0, %ymm1, %ymm0 The vector VecIns is populated with all the values from VecInMap. This is done while iterating VecInMap. VecInMap uses a hash of pointer values so the resulting order can vary depending on the memory layout. The fix is to populate the vector VecIns earlier as VecInMap is populated. This is done in DAG traversal order. Fixes <rdar://problem/16398806> llvm-svn: 204623
* [mips] Add error message when trying to use $at in '.set noat' mode.Daniel Sanders2014-03-241-1/+6
| | | | | | | | | | Summary: Patch by David Chisnall His work was sponsored by: DARPA, AFRL Differential Revision: http://llvm-reviews.chandlerc.com/D3158 llvm-svn: 204621
* Removes the NVPTXSplitBBatBar pass.Eli Bendersky2014-03-244-118/+0
| | | | | | | This pass is a historic remnant and actually causes less efficient code to be generated in some cases. llvm-svn: 204620
* R600/SI: Fix warning with gcc 4.8.2Tom Stellard2014-03-241-1/+1
| | | | llvm-svn: 204618
* R600/SI: Promote fp64 SELECT to i64Tom Stellard2014-03-242-12/+2
| | | | | | | This type promotion is replacing a Tablegen pattern and it is already covered by existing tests. llvm-svn: 204617
* SelectionDAG: Allow promotion of SELECT nodes from float to int typesTom Stellard2014-03-241-1/+2
| | | | | | | | And vice-versa, as long as the types are the same width. There are a few R600 tests that will cover this. llvm-svn: 204616
* R600: Reorganize tablegen instruction definitionsTom Stellard2014-03-245-781/+826
| | | | | | Each GPU family now has its own file. llvm-svn: 204615
* [PPC64LE] ELFv2 ABI updates for the .opd sectionWill Schmidt2014-03-241-0/+5
| | | | | | | | | | | | | | | | | | [PPC64LE] ELFv2 ABI updates for the .opd section The PPC64 Little Endian (PPC64LE) target supports the ELFv2 ABI, and as such, does not have a ".opd" section. This is keyed off a _CALL_ELF=2 macro check. The CALL_ELF check is not clearly documented at this time. The basis for usage in this patch is from the gcc thread here: http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01144.html > Adding comment from Uli: Looks good to me. I think the old-style JIT doesn't really work anyway for 64-bit, but at least with this patch LLVM will compile and link again on a ppc64le host ... llvm-svn: 204614
* [mips] Allow dsubu to take an immediate as an alias for dsubiu.Daniel Sanders2014-03-241-0/+3
| | | | | | | | | | Summary: Patch by David Chisnall His work was sponsored by: DARPA, AFRL Differential Revision: http://llvm-reviews.chandlerc.com/D3155 llvm-svn: 204611
* [PowerPC] Mark many instructions as commutativeHal Finkel2014-03-244-4/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I'm under the impression that we used to infer the isCommutable flag from the instruction-associated pattern. Regardless, we don't seem to do this (at least by default) any more. I've gone through all of our instruction definitions, and marked as commutative all of those that should be trivial to commute (by exchanging the first two operands). There has been special code for the RL* instructions, and that's not changed. Before this change, we had the following commutative instructions: RLDIMI RLDIMIo RLWIMI RLWIMI8 RLWIMI8o RLWIMIo XSADDDP XSMULDP XVADDDP XVADDSP XVMULDP XVMULSP After: ADD4 ADD4o ADD8 ADD8o ADDC ADDC8 ADDC8o ADDCo ADDE ADDE8 ADDE8o ADDEo AND AND8 AND8o ANDo CRAND CREQV CRNAND CRNOR CROR CRXOR EQV EQV8 EQV8o EQVo FADD FADDS FADDSo FADDo FMADD FMADDS FMADDSo FMADDo FMSUB FMSUBS FMSUBSo FMSUBo FMUL FMULS FMULSo FMULo FNMADD FNMADDS FNMADDSo FNMADDo FNMSUB FNMSUBS FNMSUBSo FNMSUBo MULHD MULHDU MULHDUo MULHDo MULHW MULHWU MULHWUo MULHWo MULLD MULLDo MULLW MULLWo NAND NAND8 NAND8o NANDo NOR NOR8 NOR8o NORo OR OR8 OR8o ORo RLDIMI RLDIMIo RLWIMI RLWIMI8 RLWIMI8o RLWIMIo VADDCUW VADDFP VADDSBS VADDSHS VADDSWS VADDUBM VADDUBS VADDUHM VADDUHS VADDUWM VADDUWS VAND VAVGSB VAVGSH VAVGSW VAVGUB VAVGUH VAVGUW VMADDFP VMAXFP VMAXSB VMAXSH VMAXSW VMAXUB VMAXUH VMAXUW VMHADDSHS VMHRADDSHS VMINFP VMINSB VMINSH VMINSW VMINUB VMINUH VMINUW VMLADDUHM VMULESB VMULESH VMULEUB VMULEUH VMULOSB VMULOSH VMULOUB VMULOUH VNMSUBFP VOR VXOR XOR XOR8 XOR8o XORo XSADDDP XSMADDADP XSMAXDP XSMINDP XSMSUBADP XSMULDP XSNMADDADP XSNMSUBADP XVADDDP XVADDSP XVMADDADP XVMADDASP XVMAXDP XVMAXSP XVMINDP XVMINSP XVMSUBADP XVMSUBASP XVMULDP XVMULSP XVNMADDADP XVNMADDASP XVNMSUBADP XVNMSUBASP XXLAND XXLNOR XXLOR XXLXOR This is a by-inspection change, and I'm not sure how to write a reliable test case. I would like advice on this, however. llvm-svn: 204609
* [mips] Implement shorthand add / sub forms for MIPS.Daniel Sanders2014-03-243-1/+94
| | | | | | | | | | | | | | | | | | | | | Summary: - If only two registers are passed to a three-register operation, then the first argument is both source and destination register. - If a non-register is passed as the last argument, generate the immediate version of the instruction. Also mark DADD commutative and add scheduling information (to the generic scheduler), and implement DSUB. Patch by David Chisnall His work was sponsored by: DARPA, AFRL CC: theraven Differential Revision: http://llvm-reviews.chandlerc.com/D3148 llvm-svn: 204605
* [NVPTX] Add isel patterns for addrspacecastJustin Holewinski2014-03-242-0/+64
| | | | llvm-svn: 204600
* [PowerPC] Don't schedule VSX copy legalization unless VSX is enabledHal Finkel2014-03-241-1/+2
| | | | | | There is no need to schedule this extra pass if it will have nothing to do. llvm-svn: 204594
* [PowerPC] Update comment re: VSX copy-instruction selectionHal Finkel2014-03-241-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I've done some experimentation with this, and it looks like using the lower-latency (but lower throughput) copy instruction is essentially always the right thing to do. My assumption is that, in order to be relatively sure that the higher-latency copy will increase throughput, we'd want to have it unlikely to be in-flight with its use. On the P7, the global completion table (GCT) can hold a maximum of 120 instructions, shared among all active threads (up to 4), giving 30 instructions per thread. So specifically, I'd require at least that many instructions between the copy and the use before the high-latency variant is used. Trying this, however, over the entire test suite resulted in zero cases where the high-latency form would be preferable. This may be a consequence of the fact that the scheduler views copies as free, and so they tend to end up close to their uses. For this experiment I created a function: unsigned chooseVSXCopy(MachineBasicBlock &MBB, MachineBasicBlock::iterator I, unsigned DestReg, unsigned SrcReg, unsigned StartDist = 1, unsigned Depth = 3) const; with an implementation like: if (!Depth) return PPC::XXLOR; const unsigned MaxDist = 30; unsigned Dist = StartDist; for (auto J = I, JE = MBB.end(); J != JE && Dist <= MaxDist; ++J) { if (J->isTransient() && !J->isCopy()) continue; if (J->isCall() || J->isReturn() || J->readsRegister(DestReg, TRI)) return PPC::XXLOR; ++Dist; } // We've exceeded the required distance for the high-latency form, use it. if (Dist > MaxDist) return PPC::XVCPSGNDP; // If this is only an exit block, use the low-latency form. if (MBB.succ_empty()) return PPC::XXLOR; // We've reached the end of the block, check the successor blocks (up to some // depth), and use the high-latency form if that is okay with all successors. for (auto J = MBB.succ_begin(), JE = MBB.succ_end(); J != JE; ++J) { if (chooseVSXCopy(**J, (*J)->begin(), DestReg, SrcReg, Dist, --Depth) == PPC::XXLOR) return PPC::XXLOR; } // All of our successor blocks seem okay with the high-latency variant, so // we'll use it. return PPC::XVCPSGNDP; and then changed the copy opcode selection from: Opc = PPC::XXLOR; to: Opc = chooseVSXCopy(MBB, std::next(I), DestReg, SrcReg); In conclusion, I'm removing the FIXME from the comment, because I believe that there is, at least absent other examples, nothing to fix. llvm-svn: 204591
* Allow constant folding of ceil function whenever feasibleKarthik Bhat2014-03-241-0/+3
| | | | llvm-svn: 204583
* Propagate section from base to derived symbol.Rafael Espindola2014-03-241-18/+19
| | | | | | | | | | | | We were already propagating the section in a = b With this patch we also propagate it for a = b + 1 llvm-svn: 204581
* InstrProf: Silence spurious warnings in GCC 4.8Duncan P. N. Exon Smith2014-03-241-9/+13
| | | | | | No functionality change. llvm-svn: 204580
* ARM: no need to update SplatBits as it is not usedArnaud A. de Grandmaison2014-03-231-3/+0
| | | | llvm-svn: 204575
* WinCOFF: Add support for -ffunction-sectionsDavid Majnemer2014-03-231-4/+9
| | | | | | | | This is a pretty straight forward translation for COFF, we just need to stick the function in a COMDAT section marked as IMAGE_COMDAT_SELECT_NODUPLICATES. llvm-svn: 204565
* remove a bunch of unused private methodsNuno Lopes2014-03-2321-255/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | found with a smarter version of -Wunused-member-function that I'm playwing with. Appologies in advance if I removed someone's WIP code. include/llvm/CodeGen/MachineSSAUpdater.h | 1 include/llvm/IR/DebugInfo.h | 3 lib/CodeGen/MachineSSAUpdater.cpp | 10 -- lib/CodeGen/PostRASchedulerList.cpp | 1 lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp | 10 -- lib/IR/DebugInfo.cpp | 12 -- lib/MC/MCAsmStreamer.cpp | 2 lib/Support/YAMLParser.cpp | 39 --------- lib/TableGen/TGParser.cpp | 16 --- lib/TableGen/TGParser.h | 1 lib/Target/AArch64/AArch64TargetTransformInfo.cpp | 9 -- lib/Target/ARM/ARMCodeEmitter.cpp | 12 -- lib/Target/ARM/ARMFastISel.cpp | 84 -------------------- lib/Target/Mips/MipsCodeEmitter.cpp | 11 -- lib/Target/Mips/MipsConstantIslandPass.cpp | 12 -- lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp | 21 ----- lib/Target/NVPTX/NVPTXISelDAGToDAG.h | 2 lib/Target/PowerPC/PPCFastISel.cpp | 1 lib/Transforms/Instrumentation/AddressSanitizer.cpp | 2 lib/Transforms/Instrumentation/BoundsChecking.cpp | 2 lib/Transforms/Instrumentation/MemorySanitizer.cpp | 1 lib/Transforms/Scalar/LoopIdiomRecognize.cpp | 8 - lib/Transforms/Scalar/SCCP.cpp | 1 utils/TableGen/CodeEmitterGen.cpp | 2 24 files changed, 2 insertions(+), 261 deletions(-) llvm-svn: 204560
* [PowerPC] Make use of VSX f64 <-> i64 conversion instructionsHal Finkel2014-03-231-6/+12
| | | | | | | When VSX is available, these instructions should be used in preference to the older variants that only have access to the scalar floating-point registers. llvm-svn: 204559
* Revert r204076 for now - it caused significant regressions in a number ofLang Hames2014-03-231-47/+78
| | | | | | | | benchmarks. <rdar://problem/16368461> llvm-svn: 204558
* InstrProf: Check pointer size in raw profileDuncan P. N. Exon Smith2014-03-231-11/+40
| | | | | | | | | | | | | | | Since the profile can come from 32-bit machines, we need to check the pointer size. Change the magic number to facilitate this. Adds tests for reading 32-bit and 64-bit binaries (both big- and little-endian). The tests write a binary using printf in RUN lines (like raw-magic-but-no-header.test). Assuming the bots don't complain, this seems like a better way forward for testing RawInstrProfReader than committing binary files. <rdar://problem/16400648> llvm-svn: 204557
* Propagate types from symbol to aliases.Rafael Espindola2014-03-231-1/+22
| | | | | | | | | | | | | | | | | | | | | | | This is similar, but not identical to what gas does. The logic in MC is to just compute the symbol table after parsing the entire file. GAS is mixed, given .type b, @object a = b b: .type b, @function It will propagate the change and make 'a' a function. Given .type b, @object b: a = b .type b, @function the type of 'a' is still object. Since we do the computation in the end, we produce a function in both cases. llvm-svn: 204555
* [CMake] LLVMProfileData: No need to add LINK_LIBS here. LLVMBuild should do.NAKAMURA Takumi2014-03-231-4/+1
| | | | llvm-svn: 204553
* Prune includes in ARM target.Craig Topper2014-03-2232-55/+25
| | | | llvm-svn: 204548
* ARM IAS: properly handle function entries in .thumbSaleem Abdulrasool2014-03-222-0/+43
| | | | | | | | | | | | | | | | | | | | When a label is parsed, check if there is type information available for the label. If so, check if the symbol is a function. If the symbol is a function and we are in thumb mode and no explicit thumb_func has been emitted, adjust the symbol data to indicate that the function definition is a thumb function. The application of this inferencing is improved value handling in the object file (the required thumb bit is set on symbols which are thumb functions). It also helps improve compatibility with binutils. The one complication that arises from this handling is the MCAsmStreamer. The default implementation of getOrCreateSymbolData in MCStreamer does not support tracking the symbol data. In order to support the semantics of thumb functions, track symbol data in assembly streamer. Although O(n) in number of labels in the TU, this is already done in various other streamers and as such the memory overhead is not a practical concern in this scenario. llvm-svn: 204544
OpenPOWER on IntegriCloud