summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* PR1860 - We can't save a list of ExtractElement instructions to CSE because ↵Nadav Rotem2013-11-261-16/+11
| | | | | | | | some of these instructions may be removed and optimized in future iterations. Instead we save a list of basic blocks that we need to CSE. llvm-svn: 195791
* 80-column fixups.Eric Christopher2013-11-263-3/+7
| | | | llvm-svn: 195790
* [AArch64] Add support for NEON scalar floating-point to integer convertChad Rosier2013-11-261-1/+63
| | | | | | instructions. llvm-svn: 195788
* LoopVectorizer: Truncate i64 trip counts of i32 phis if necessaryArnold Schwaighofer2013-11-261-0/+9
| | | | | | | | | | | In signed arithmetic we could end up with an i64 trip count for an i32 phi. Because it is signed arithmetic we know that this is only defined if the i32 does not wrap. It is therefore safe to truncate the i64 trip count to a i32 value. Fixes PR18049. llvm-svn: 195787
* Fix a bug related to constant islands for Mips16 and mips16/32 dual mode.Reed Kotler2013-11-261-3/+2
| | | | | | | The determination of when we are doing constant pools was being made too early in the asm printer. llvm-svn: 195781
* Refactor some code in SampleProfile.cppDiego Novillo2013-11-261-99/+112
| | | | | | | | | | | | | | | I'm adding new functionality in the sample profiler. This will require more data to be kept around for each function, so I moved the structure SampleProfile that we keep for each function into a separate class. There are no functional changes in this patch. It simply provides a new home where to place all the new data that I need to propagate weights through edges. There are some other name and minor edits throughout. llvm-svn: 195780
* Fix PR18054Michael Liao2013-11-261-7/+15
| | | | | | | | - Fix bug in (vsext (vzext x)) -> (vsext x) in SIGN_EXTEND_IN_REG lowering where we need to check whether x is a vector type (in-reg type) of i8, i16 or i32; otherwise, that optimization is not valid. llvm-svn: 195779
* DwarfDebug: Include type units in accelerator tables.David Blaikie2013-11-262-25/+31
| | | | | | | Since type units aren't in the CUMap, use the DwarfUnits list to iterate over units for tasks such as accelerator table building. llvm-svn: 195776
* Fix spurious return introduced by my earlier patch to DebugInfoRenato Golin2013-11-261-1/+0
| | | | llvm-svn: 195775
* PR18060 - When we RAUW values with ExtractElement instructions in some casesNadav Rotem2013-11-261-0/+8
| | | | | | | | we generate PHI nodes with multiple entries from the same basic block but with different values. Enabling CSE on ExtractElement instructions make sure that all of the RAUWed instructions are the same. llvm-svn: 195773
* Add return to DIType::VerifyRenato Golin2013-11-261-3/+3
| | | | | | | | Code scanner ran by Sylvestre Ledru got a no_return bug in DebugInfo.cpp. Adding the return statements that should be there. llvm-svn: 195772
* PR17925 bugfix.Stepan Dyatkovskiy2013-11-261-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Short description. This issue is about case of treating pointers as integers. We treat pointers as different if they references different address space. At the same time, we treat pointers equal to integers (with machine address width). It was a point of false-positive. Consider next case on 32bit machine: void foo0(i32 addrespace(1)* %p) void foo1(i32 addrespace(2)* %p) void foo2(i32 %p) foo0 != foo1, while foo1 == foo2 and foo0 == foo2. As you can see it breaks transitivity. That means that result depends on order of how functions are presented in module. Next order causes merging of foo0 and foo1: foo2, foo0, foo1 First foo0 will be merged with foo2, foo0 will be erased. Second foo1 will be merged with foo2. Depending on order, things could be merged we don't expect to. The fix: Forbid to treat any pointer as integer, except for those, who belong to address space 0. llvm-svn: 195769
* Rename DwarfException methods so the new names are consistent with ↵Timur Iskhodzhanov2013-11-266-50/+50
| | | | | | DwarfDebug and the style guide llvm-svn: 195763
* Darwin-ARM: use movw/movt for static relocationsTim Northover2013-11-262-8/+4
| | | | llvm-svn: 195759
* [PM] Factor the overwhelming majority of the interface boiler plate outChandler Carruth2013-11-261-49/+42
| | | | | | | | | | | | | | of the two analysis managers into a CRTP base class that can be shared and re-used in building any analysis manager. This will in turn simplify adding yet another analysis manager to the system. The base class provides all of the interface sugar for the analysis manager delegating the functionality back through DerivedT methods which operate on simple pass IDs. It also provides the pass registration, storage, and lookup system which is common across the various formulations of analysis managers. llvm-svn: 195747
* [SystemZ] Fix incorrect use of RISBG for a zero-extended right shiftRichard Sandiford2013-11-261-19/+8
| | | | | | | | | We would wrongly transform the testcase into the equivalent of an AND with 1. The problem was that, when testing whether the shifted-in bits of the right shift were significant, we used the width of the final zero-extended result rather than the width of the shifted value. llvm-svn: 195731
* [PM] Split the CallGraph out from the ModulePass which creates theChandler Carruth2013-11-2612-93/+129
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CallGraph. This makes the CallGraph a totally generic analysis object that is the container for the graph data structure and the primary interface for querying and manipulating it. The pass logic is separated into its own class. For compatibility reasons, the pass provides wrapper methods for most of the methods on CallGraph -- they all just forward. This will allow the new pass manager infrastructure to provide its own analysis pass that constructs the same CallGraph object and makes it available. The idea is that in the new pass manager, the analysis pass's 'run' method returns a concrete analysis 'result'. Here, that result is a 'CallGraph'. The 'run' method will typically do only minimal work, deferring much of the work into the implementation of the result object in order to be lazy about computing things, but when (like DomTree) there is *some* up-front computation, the analysis does it prior to handing the result back to the querying pass. I know some of this is fairly ugly. I'm happy to change it around if folks can suggest a cleaner interim state, but there is going to be some amount of unavoidable ugliness during the transition period. The good thing is that this is very limited and will naturally go away when the old pass infrastructure goes away. It won't hang around to bother us later. Next up is the initial new-PM-style call graph analysis. =] llvm-svn: 195722
* [PM] Reformat some code with clang-format as I'm going to be editting asChandler Carruth2013-11-261-21/+12
| | | | | | | part of generalizing the call graph infrastructure for the new pass manager. llvm-svn: 195718
* Refactored the implementation of AArch64 NEON instruction ZIP, UZPKevin Qin2013-11-263-328/+226
| | | | | | | and TRN. Fix a bug when mixed use of vget_high_u8() and vuzp_u8(). llvm-svn: 195716
* [AArch64]Implement 128 bit register copy with NEON.Kevin Qin2013-11-261-17/+19
| | | | llvm-svn: 195713
* StackMap: Implement support for DirectMemRefOp.Andrew Trick2013-11-264-10/+90
| | | | | | | | | | | | | | | | | | | | | | | | | | | | A Direct stack map location records the address of frame index. This address is itself the value that the runtime requested. This differs from IndirectMemRefOp locations, which refer to a stack locations from which the requested values must be loaded. Direct locations can directly communicate the address if an alloca, while IndirectMemRefOp handle register spills. For example: entry: %a = alloca i64... llvm.experimental.stackmap(i32 <ID>, i32 <shadowBytes>, i64* %a) Since both the alloca and stackmap intrinsic are in the entry block, and the intrinsic takes the address of the alloca, the runtime can assume that LLVM will not substitute alloca with any intervening value. This must be verified by the runtime by checking that the stack map's location is a Direct location type. The runtime can then determine the alloca's relative location on the stack immediately after compilation, or at any time thereafter. This differs from Register and Indirect locations, because the runtime can only read the values in those locations when execution reaches the instruction address of the stack map. llvm-svn: 195712
* whitespaceAndrew Trick2013-11-261-5/+5
| | | | llvm-svn: 195711
* Lift self-copy protection up to the header file and add self-moveChandler Carruth2013-11-261-2/+3
| | | | | | | | | | | | protection to the same layer. This is in line with Howard's advice on how best to handle self-move assignment as he explained on SO[1]. It also ensures that implementing swap with move assignment continues to work in the case of self-swap. [1]: http://stackoverflow.com/questions/9322174/move-assignment-operator-and-if-this-rhs llvm-svn: 195705
* Fix a self-memcpy which only breaks under Valgrind's memcpyChandler Carruth2013-11-261-0/+3
| | | | | | | implementation. Silliness, but it'll be a trivial performance optimization. This should clear up a failure on the vg_leak bot. llvm-svn: 195704
* [PM] Rename the 'Mod' member to the more idiomatic 'M'. No functionalityChandler Carruth2013-11-261-3/+3
| | | | | | changed. llvm-svn: 195701
* DebugInfo: Remove CompileUnit::constructTypeDIEImpl now that it's just a ↵David Blaikie2013-11-262-15/+2
| | | | | | | | | simple wrapper again. r195698 moved the type unit checking up into getOrCreateTypeDIE so remove the redundant check and fold the functions back together again. llvm-svn: 195700
* DebugInfo: Avoid emitting pubtype entries for type DIEs that just indirect ↵David Blaikie2013-11-262-57/+65
| | | | | | to a type unit. llvm-svn: 195698
* Add an intrinsic for the SSE2 PAUSE instruction.Cameron McInally2013-11-261-1/+3
| | | | llvm-svn: 195697
* DebugInfo: Pubtypes: Coelesce pubtype registration with accelerator type ↵David Blaikie2013-11-263-49/+13
| | | | | | | | | | registration. It might be possible to eventually use one data structure, but I haven't looked at the exact criteria used for accelerator tables and pubtypes to see if there's good reason for the differences between the two or not. llvm-svn: 195696
* Do the string comparison in the constructor instead of once per nop.Rafael Espindola2013-11-251-6/+9
| | | | | | Thanks to Roman Divacky for the suggestion. llvm-svn: 195684
* Don't use nopl in cpus that don't support it.Rafael Espindola2013-11-251-1/+5
| | | | | | | | | | | | | | | Patch by Mikulas Patocka. I added the test. I checked that for cpu names that gas knows about, it also doesn't generate nopl. The modified cpus: i686 - there are i686-class CPUs that don't have nopl: Via c3, Transmeta Crusoe, Microsoft VirtualBox - see https://bbs.archlinux.org/viewtopic.php?pid=775414 k6, k6-2, k6-3, winchip-c6, winchip2 - these are 586-class CPUs via c3 c3-2 - see https://bugs.archlinux.org/task/19733 as a proof that Via c3 and c3-Nehemiah don't have nopl llvm-svn: 195679
* ARM integrated assembler generates incorrect nop opcodeDavid Peixotto2013-11-251-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a bug in the assembler that was causing bad code to be emitted. When switching modes in an assembly file (e.g. arm to thumb mode) we would always emit the opcode from the original mode. Consider this small example: $ cat align.s .code 16 foo: add r0, r0 .align 3 add r0, r0 $ llvm-mc -triple armv7-none-linux align.s -filetype=obj -o t.o $ llvm-objdump -triple thumbv7 -d t.o Disassembly of section .text: foo: 0: 00 44 add r0, r0 2: 00 f0 20 e3 blx #4195904 6: 00 00 movs r0, r0 8: 00 44 add r0, r0 This shows that we have actually emitted an arm nop (e320f000) instead of a thumb nop. Unfortunately, this encodes to a thumb branch which causes bad things to happen when compiling assembly code with align directives. The fix is to notify the ARMAsmBackend when we switch mode. The MCMachOStreamer was already doing this correctly. This patch makes the same change for the MCElfStreamer. There is still a bug in the way nops are emitted for alignment because the MCAlignment fragment does not store the correct mode. The ARMAsmBackend will emit nops for the last mode it knew about. In the example above, we still generate an arm nop if we add a `.code 32` to the end of the file. PR18019 llvm-svn: 195677
* Unrevert r195599 with testcase fix.Bill Wendling2013-11-251-0/+5
| | | | | | | I'm not sure how it was checking for the wrong values... PR18023. llvm-svn: 195670
* Fix indentation typoTim Northover2013-11-251-1/+1
| | | | llvm-svn: 195660
* ARM: remove special cases for Darwin dynamic-no-pic mode.Tim Northover2013-11-2511-104/+73
| | | | | | | | | These are handled almost identically to static mode (and ELF's global address materialisation), except that a symbol may have "$non_lazy_ptr" appended. This can be handled by passing appropriate flags along with the instruction instead of using entirely separate pseudo-instructions. llvm-svn: 195655
* Fix .comm and .lcomm on COFF.Rafael Espindola2013-11-251-18/+2
| | | | | | | | | | | | | | | | These should not use COMDATs. GNU as uses .bss for .lcomm and section 0 for .comm. Given static int a; int b; MSVC puts both in .bss. This patch then puts both .comm and .lcomm on .bss. With this change we agree with gas on .lcomm, are much closer on .comm and clang-cl matches msvc on the above example. llvm-svn: 195654
* Refactor to make the .bss, .data and .text sections available for other uses.Rafael Espindola2013-11-251-19/+22
| | | | | | No functionality change. llvm-svn: 195653
* Make helper function static.Benjamin Kramer2013-11-251-2/+3
| | | | llvm-svn: 195650
* ARM: remove unused patterns.Tim Northover2013-11-253-6/+1
| | | | | | | | There is no sane way for an LEApcrel (= single ADR) instruction to generate a global address on any ARM target I know of. Fortunately, no-one was trying to any more, but there were vestigial patterns. llvm-svn: 195644
* [ARM] Enable FeatureMP for Cortex-A5 by default.Amara Emerson2013-11-251-1/+1
| | | | | | Patch by Oliver Stannard. llvm-svn: 195640
* Revert r195599 as it broke the builds.Amara Emerson2013-11-251-5/+0
| | | | llvm-svn: 195636
* Fixed tryFoldToZero() for vector types that need expansion.Daniel Sanders2013-11-252-15/+4
| | | | | | | | | | | | | | | | | | | | | | Summary: Moved the requirement for SelectionDAG::getConstant() to return legally typed nodes slightly earlier. There were two optional DAGCombine passes that were missed out and were required to produce type-legal DAGs. Simplified a code-path in tryFoldToZero() to use SelectionDAG::getConstant(). This provides support for both promoted and expanded vector types whereas the previous code only supported promoted vector types. Fixes a "Type for zero vector elements is not legal" assertion detected by an llvm-stress generated test. Reviewers: resistor CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2251 llvm-svn: 195635
* X86: enable AVX2 under Haswell native compilationTim Northover2013-11-252-7/+96
| | | | | | Patch by Adam Strzelecki llvm-svn: 195632
* Don't look past volatile loads.Bill Wendling2013-11-251-0/+5
| | | | | | | A volatile load should block us from trying to coalesce stores. PR18023 llvm-svn: 195599
* Fixed a bug about disassembling AArch64 post-index load/store single element ↵Hao Liu2013-11-251-9/+14
| | | | | | | | | | instructions. ie. echo "0x00 0x04 0x80 0x0d" | ../bin/llvm-mc -triple=aarch64 -mattr=+neon -disassemble echo "0x00 0x00 0x80 0x0d" | ../bin/llvm-mc -triple=aarch64 -mattr=+neon -disassemble will be disassembled into the same instruction st1 {v0b}[0], [x0], x0. llvm-svn: 195591
* SparcFrameLowering.cpp: Prune 'DL' [-Wunused-variable]NAKAMURA Takumi2013-11-251-1/+0
| | | | llvm-svn: 195590
* Output a bit more information in the debug printing for MBP. This wasChandler Carruth2013-11-251-3/+4
| | | | | | useful when analyzing parts of zlib's behavior here. llvm-svn: 195588
* [Sparc] Emit large negative adjustments to SP/FP with sethi+xor instead of ↵Venkatraman Govindaraju2013-11-244-40/+108
| | | | | | sethi+or. This generates correct code for both sparc32 and sparc64. llvm-svn: 195576
* [Sparc]: Implement LEA pattern for sparcv9.Venkatraman Govindaraju2013-11-242-4/+11
| | | | llvm-svn: 195575
* [SparcV9]: Do not emit .register directives for global registers that are ↵Venkatraman Govindaraju2013-11-241-1/+1
| | | | | | clobbered by calls but not used in the function itself. llvm-svn: 195574
OpenPOWER on IntegriCloud