path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* PPC: Initial support for permutation-based unaligned Altivec loads | Hal Finkel | 2013-05-24 | 1 | -0/+129
  Altivec only directly supports aligned loads, but the loads have a strange property: if given an unaligned address, they truncate the address to the next lower aligned address and load from there. This property, along with an extra load and some special-purpose permutation-control instructions that generate the appropriate permutations from the original unaligned address, allows efficient lowering of unaligned loads. This code uses the trick explained in the Apple Velocity Engine optimization overview document to prevent the needed extra load from possibly causing a page fault if the original address happens to be aligned.

  As noted in the FIXMEs, there are several additional optimizations that can be performed to reduce the cost of these loads even more. These will be implemented in future commits.

  llvm-svn: 182691
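The lowering described in this commit can be modelled in a few lines. The following is a minimal sketch, not the code from the patch: `lvx`, `lvsl` and `vperm` are Python stand-ins named after the Altivec instructions, memory is a plain `bytes` object, and byte order is simplified to match Python indexing. It shows why the second load uses `addr + 15` rather than `addr + 16`: for an aligned address both loads hit the same 16-byte block, so the following block (and possibly the following page) is never touched.

```python
VEC = 16  # Altivec vector width in bytes


def lvx(mem, addr):
    """Aligned vector load: the low four address bits are ignored."""
    base = addr & ~(VEC - 1)
    return mem[base:base + VEC]


def lvsl(addr):
    """Permute control selecting 16 bytes starting at the misalignment."""
    shift = addr & (VEC - 1)
    return [shift + i for i in range(VEC)]


def vperm(a, b, ctrl):
    """Pick bytes from the 32-byte concatenation of a and b."""
    both = a + b
    return bytes(both[i] for i in ctrl)


def unaligned_load(mem, addr):
    low = lvx(mem, addr)
    # addr + 15, not addr + 16: when addr is already aligned this reloads
    # the same block instead of reading the one after it.
    high = lvx(mem, addr + VEC - 1)
    return vperm(low, high, lvsl(addr))


memory = bytes(range(64))
```

Calling `unaligned_load(memory, 5)` returns the 16 bytes starting at offset 5, and `unaligned_load(memory, 48)` succeeds even though offset 64 is past the end of `memory`.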
* Follow up of the introduction of MCSymbolizer. | Quentin Colombet | 2013-05-24 | 6 | -12/+37
  - Resurrect old MCDisassemble API to soften transition.
  - Extend MCTargetDesc to set target specific symbolizer.

  llvm-svn: 182688
* Replace Count{Leading,Trailing}Zeros_{32,64} with count{Leading,Trailing}Zeros. | Michael J. Spencer | 2013-05-24 | 33 | -75/+75
  llvm-svn: 182680
* Add missing header for atexit. | Michael J. Spencer | 2013-05-24 | 1 | -0/+2
  llvm-svn: 182672
* [objc-arc] KnownSafe does not imply that it is safe to perform code motion across CFG edges since even if it is safe to remove RR pairs, we may still be able to move a retain/release into a loop. | Michael Gottesman | 2013-05-24 | 1 | -8/+40
  rdar://13949644

  llvm-svn: 182670
* [objc-arc] Make sure that multiple owners is propagated correctly through the pass via the usage of a global data structure. | Michael Gottesman | 2013-05-24 | 1 | -17/+16
  rdar://13750319

  llvm-svn: 182669
* LoopVectorize: LoopSimplify can't canonicalize loops with an indirectbr in it, don't assert on those cases. | Benjamin Kramer | 2013-05-24 | 1 | -1/+4
  Fixes PR16139.

  llvm-svn: 182656
* Do not reserve space for the ColdEdges and NormalEdges vectors. | Diego Novillo | 2013-05-24 | 1 | -2/+0
  Discussion and rationale at http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130520/175698.html

  llvm-svn: 182653
* [SystemZ] Improve AsmParser handling of invalid instructions | Richard Sandiford | 2013-05-24 | 1 | -74/+103
  Previously, an invalid instruction like:

      foo %r1, %r0

  would generate the rather odd error message:

      ....: error: unknown token in expression
      foo %r1, %r0
      ^

  We now get the more informative:

      ....: error: invalid instruction
      foo %r1, %r0
      ^

  The same would happen if an address were used where a register was expected. We now get "invalid operand for instruction" instead.

  llvm-svn: 182644
* [SystemZ] Improve AsmParser register parsing | Richard Sandiford | 2013-05-24 | 1 | -53/+80
  The idea is to make sure that:
  (1) "register expected" is restricted to cases where ParseRegister() is called and the token obviously isn't a register.
  (2) "invalid register" is restricted to cases where a register-like "%..." sequence is found, but the "..." makes no sense.
  (3) the generic "invalid operand for instruction" is used in cases where the wrong register type is used (GPR instead of FPR, etc.).
  (4) the new "invalid register pair" is used if the register has the right type, but is not a valid register pair.

  Testing of (1)-(3) is now restricted to regs-bad.s. It uses a representative instruction for each register class to make sure that only registers from that class are accepted.

  (4) is tested by both regs-bad.s (which checks all invalid register pairs) and insn-bad.s (which tests one invalid pair for each instruction that requires a pair).

  While there, I changed "Number" to "Num" for consistency with the operand class.

  llvm-svn: 182643
* Run clang-format over the scalarizePHI function. | Joey Gouly | 2013-05-24 | 1 | -12/+8
  llvm-svn: 182640
* scalarizePHI needs to insert the next ExtractElement in the same block as the BinaryOperator, *not* in the block where the IRBuilder is currently inserting into. | Joey Gouly | 2013-05-24 | 1 | -2/+4
  Fixes a bug where scalarizePHI would create instructions that would not dominate all uses.

  llvm-svn: 182639
* Add a new function attribute 'cold' to functions. | Diego Novillo | 2013-05-24 | 6 | -1/+90
  Other than recognizing the attribute, the patch does little else. It changes the branch probability analyzer so that edges into blocks postdominated by a cold function are given low weight.

  Added analysis and code generation tests. Added documentation for the new attribute.

  llvm-svn: 182638
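As a rough illustration of the heuristic described above, the sketch below marks every block from which all paths reach a call to a cold function, then gives edges into such blocks a low weight when the branch also has a non-cold target. Everything here is hypothetical: the CFG is a plain dictionary, and `COLD_WEIGHT` and `NORMAL_WEIGHT` are made-up numbers, not the values used by the branch probability analyzer.

```python
COLD_WEIGHT = 1     # hypothetical weight for an edge into a cold block
NORMAL_WEIGHT = 64  # hypothetical weight for any other edge


def cold_blocks(succs, has_cold_call):
    """Blocks that call a cold function, or whose successors are all cold."""
    cold = set(has_cold_call)
    changed = True
    while changed:
        changed = False
        for block, targets in succs.items():
            if block not in cold and targets and all(t in cold for t in targets):
                cold.add(block)
                changed = True
    return cold


def edge_weights(succs, has_cold_call):
    cold = cold_blocks(succs, has_cold_call)
    weights = {}
    for block, targets in succs.items():
        some_cold = any(t in cold for t in targets)
        all_cold = all(t in cold for t in targets)
        # Only bias branches that choose between cold and non-cold targets.
        mixed = some_cold and not all_cold
        for t in targets:
            weights[(block, t)] = COLD_WEIGHT if mixed and t in cold else NORMAL_WEIGHT
    return weights


cfg = {
    "entry": ["fast", "slow"],
    "fast": ["exit"],
    "slow": ["abort"],   # "abort" contains the call to a cold function
    "abort": [],
    "exit": [],
}
weights = edge_weights(cfg, {"abort"})
```

In this example `slow` is cold because its only successor is, so the `entry -> slow` edge receives the low weight while `entry -> fast` keeps the normal one.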
* Remove the Copied parameter from MemoryObject::readBytes. | Benjamin Kramer | 2013-05-24 | 10 | -33/+18
  There was exactly one caller using this API right, the others were relying on specific behavior of the default implementation. Since it's too hard to use it right just remove it and standardize on the default behavior.

  Defines away PR16132.

  llvm-svn: 182636
* Fix unused warning in opt builds. | Daniel Jasper | 2013-05-24 | 1 | -4/+3
  In these builds, the asserts() are completely compiled out of the code, leaving "End" unused. Directly accessing it should not have a performance impact, as it is just a data member.

  llvm-svn: 182634
* MC: Disassembled CFG reconstruction. | Ahmed Bougacha | 2013-05-24 | 10 | -80/+456
  This patch builds on some existing code to do CFG reconstruction from a disassembled binary:
  - MCModule represents the binary, and has a list of MCAtoms.
  - MCAtom represents either disassembled instructions (MCTextAtom), or contiguous data (MCDataAtom), and covers a specific range of addresses.
  - MCBasicBlock and MCFunction form the reconstructed CFG. An MCBB is backed by an MCTextAtom, and has the usual successors/predecessors.
  - MCObjectDisassembler creates a module from an ObjectFile using a disassembler. It first builds an atom for each section. It can also construct the CFG, and this splits the text atoms into basic blocks.

  MCModule and MCAtom were only sketched out; MCFunction and MCBB were implemented under the experimental "-cfg" llvm-objdump -macho option. This cleans them up for further use; llvm-objdump -d -cfg now generates graphviz files for each function found in the binary.

  In the future, MCObjectDisassembler may be the right place to do "intelligent" disassembly: for example, handling constant islands is just a matter of splitting the atom, using information that may be available in the ObjectFile. Also, better initial atom formation than just using sections is possible using symbols (and things like Mach-O's function_starts load command).

  This brings two minor regressions in llvm-objdump -macho -cfg:
  - The printing of a relocation's referenced symbol.
  - An annotation on loop BBs, i.e., which are their own successor.

  Relocation printing is replaced by the MCSymbolizer; the basic CFG annotation will be superseded by more related functionality.

  llvm-svn: 182628
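The step where CFG construction "splits the text atoms into basic blocks" follows the usual leader-based scheme. The sketch below is an assumption about the general technique, not a transcription of MCObjectDisassembler: instructions are hypothetical `(address, size, branch_target)` tuples, and a new block starts at every branch target and after every branch.

```python
def split_basic_blocks(insts):
    """Split a sorted instruction list into basic blocks.

    insts: list of (address, size, branch_target or None), sorted by address.
    Returns a list of blocks, each a list of instruction addresses.
    """
    addrs = {addr for addr, _, _ in insts}
    leaders = {insts[0][0]} if insts else set()
    for addr, size, target in insts:
        if target is None:
            continue
        if target in addrs:
            leaders.add(target)       # a branch target starts a block
        if addr + size in addrs:
            leaders.add(addr + size)  # so does the instruction after a branch
    blocks = []
    for addr, _, _ in insts:
        if addr in leaders:
            blocks.append([])
        blocks[-1].append(addr)
    return blocks


# A hypothetical text atom: a forward branch at 4 and a backward branch at 20.
text_atom = [
    (0, 4, None),
    (4, 4, 16),
    (8, 4, None),
    (12, 4, None),
    (16, 4, None),
    (20, 4, 8),
]
blocks = split_basic_blocks(text_atom)
```

Successor and predecessor edges would then be derived from the last instruction of each block; that part is omitted here.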
* Add MCSymbolizer for symbolic/annotated disassembly. | Ahmed Bougacha | 2013-05-24 | 18 | -205/+776
  This is a basic first step towards symbolization of disassembled instructions. This used to be done using externally provided (C API) callbacks. This patch introduces:
  - the MCSymbolizer class, that mimics the same functions that were used in the X86 and ARM disassemblers to symbolize immediate operands and to annotate loads based off PC (for things like c string literals).
  - the MCExternalSymbolizer class, which implements the old C API.
  - the MCRelocationInfo class, which provides a way for targets to translate relocations (either object::RelocationRef, or disassembler C API VariantKinds) to MCExprs.
  - the MCObjectSymbolizer class, which does symbolization using what it finds in an object::ObjectFile. This makes simple symbolization (with no fancy relocation stuff) work for all object formats!
  - x86-64 Mach-O and ELF MCRelocationInfos.
  - A basic ARM Mach-O MCRelocationInfo, that provides just enough to support the C API VariantKinds.

  Most of what works in otool (the only user of the old symbolization API that I know of) for x86-64 symbolic disassembly (-tvV) works, namely:
  - symbol references: call _foo; jmp 15 <_foo+50>
  - relocations: call _foo-_bar; call _foo-4
  - __cf?string: leaq 193(%rip), %rax ## literal pool for "hello"

  Stub support is the main missing part (because libObject doesn't know, among other things, about mach-o indirect symbols).

  As for the MCSymbolizer API, instead of relying on the disassemblers to call the tryAdding* methods, maybe this could be done automagically using InstrInfo? For instance, even though PC-relative LEAs are used to get the address of string literals in a typical Mach-O file, a MOV would be used in an ELF file. And right now, the explicit symbolization only recognizes PC-relative LEAs. InstrInfo should already have most of what is needed to know what to symbolize, so this can definitely be improved.

  I'd also like to remove object::RelocationRef::getValueString (it seems only used by relocation printing in objdump), as simply printing the created MCExpr is definitely enough (and cleaner than string concats).

  llvm-svn: 182625
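To make the "symbol references" case concrete, here is a small sketch of how an address can be rendered as `symbol+offset` from a sorted symbol table. It only illustrates the idea; the `symbolize` helper and the `(start_address, name)` table are hypothetical and are not the MCSymbolizer or MCObjectSymbolizer interface.

```python
import bisect


def symbolize(address, symbols):
    """Render an address as 'name' or 'name+offset'.

    symbols: list of (start_address, name) sorted by start_address.
    Falls back to the hex address when no symbol precedes it.
    """
    starts = [start for start, _ in symbols]
    index = bisect.bisect_right(starts, address) - 1
    if index < 0:
        return hex(address)
    start, name = symbols[index]
    offset = address - start
    return name if offset == 0 else f"{name}+{offset}"


# Hypothetical symbol table for illustration only.
symtab = [(0x100, "_foo"), (0x200, "_bar")]
```

With this table, address `0x132` is rendered as `_foo+50`, the same shape as the `jmp 15 <_foo+50>` example in the commit message.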
* [PowerPC] Remove symbolLo/symbolHi instruction operand types | Ulrich Weigand | 2013-05-23 | 2 | -37/+19
  Now that there is no longer any distinction between symbolLo and symbolHi operands in either printing, encoding, or parsing, the operand types can be removed in favor of simply using s16imm.

  This completes the patch series to decouple lo/hi operand part processing from the particular instruction whose operand it is.

  No change in code generation expected from this patch.

  llvm-svn: 182618
* Re-implement DebugIR in a way that does not subclass AssemblyWriter: | Daniel Malea | 2013-05-23 | 3 | -103/+287
  - move AsmWriter.h from public headers into lib
  - marked all AssemblyWriter functions as non-virtual; no need to override them
  - DebugIR now "plugs into" AssemblyWriter with an AssemblyAnnotationWriter helper
  - exposed flags to control hiding of a) debug metadata b) debug intrinsic calls

  C/R: Paul Redmond

  llvm-svn: 182617
* [PowerPC] Clean up generation of ha16() / lo16() markers | Ulrich Weigand | 2013-05-23 | 11 | -108/+257
  When targeting the Darwin assembler, we need to generate markers ha16() and lo16() to designate the high and low parts of a (symbolic) immediate. This is necessary not just for plain symbols, but also for certain symbolic expressions, typically along the lines of ha16(A - B). The latter doesn't work when simply using VariantKind flags on the symbol reference. This is why the current back-end uses hacks (explicitly called out as such via multiple FIXMEs) in the symbolLo/symbolHi print methods.

  This patch uses target-defined MCExpr codes to represent the Darwin ha16/lo16 constructs, following along the lines of the equivalent solution used by the ARM back end to handle their :upper16: / :lower16: markers. This allows us to get rid of special handling both in the symbolLo/symbolHi print method and in the common code MCExpr::print routine. Instead, the ha16 / lo16 markers are printed simply in a custom print routine for the target MCExpr types. (As a result, the symbolLo/symbolHi print methods can now be replaced by a single printS16ImmOperand routine that also handles symbolic operands.)

  The patch also provides an EvaluateAsRelocatableImpl routine to handle ha16/lo16 constructs. This is not actually used at the moment by any in-tree code, but is provided as it makes merging into David Fang's out-of-tree Mach-O object writer simpler.

  Since there is no longer any need to treat VK_PPC_GAS_HA16 and VK_PPC_DARWIN_HA16 differently, they are merged into a single VK_PPC_ADDR16_HA (and likewise for the _LO16 types).

  llvm-svn: 182616
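For readers unfamiliar with the markers, the arithmetic behind ha16() and lo16() can be sketched as follows. The low half is later added as a sign-extended 16-bit immediate, so the "high adjusted" half is rounded up whenever bit 15 of the value is set. The Python helpers below are illustrative only and operate on plain 32-bit integers rather than symbolic expressions.

```python
def lo16(value):
    """Low 16 bits of a 32-bit value."""
    return value & 0xFFFF


def ha16(value):
    """High 16 bits, adjusted for the sign extension of the low half."""
    return ((value + 0x8000) >> 16) & 0xFFFF


def sext16(half):
    """Interpret a 16-bit pattern as a signed integer."""
    return half - 0x10000 if half & 0x8000 else half


def rebuild(value):
    """Recombine the halves: shift the high part, add the signed low part."""
    return ((ha16(value) << 16) + sext16(lo16(value))) & 0xFFFFFFFF
```

For example, `0x12348000` splits into `ha16 = 0x1235` and `lo16 = 0x8000`; adding the sign-extended low half (`-0x8000`) to `0x12350000` gives back the original value.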
* ARM: implement @llvm.readcyclecounter intrinsic | Tim Northover | 2013-05-23 | 1 | -1/+43
  This implements the @llvm.readcyclecounter intrinsic as the specific MRC instruction specified in the ARM manuals for CPUs with the Power Management extensions.

  Older CPUs had slightly different methods which may also have to be implemented eventually, but this should cover all v7 cases.

  rdar://problem/13939186

  llvm-svn: 182603
* ARM: Add Performance Monitor Extensions feature | Tim Northover | 2013-05-23 | 3 | -1/+10
  Performance monitors, including a basic cycle counter, are an official extension in the ARMv7 specification. This adds support for enabling and disabling them, orthogonally from CPU selection.

  rdar://problem/13939186

  llvm-svn: 182602
* R600: Fix R600ControlFlowFinalizer not considering VTX_READ 128 bit dst reg | Tom Stellard | 2013-05-23 | 1 | -2/+9
  Patch by: Vincent Lejeune

  https://bugs.freedesktop.org/show_bug.cgi?id=64877

  NOTE: This is a candidate for the 3.3 branch.

  llvm-svn: 182600
* Move passes from namespace llvm into anonymous namespaces. Sort includes while there. | Benjamin Kramer | 2013-05-23 | 15 | -37/+37
  llvm-svn: 182594
* Fix PR16110: Handle DBG_VALUE in ConnectedVNInfoEqClasses::Distribute(). | Jakob Stoklund Olesen | 2013-05-23 | 1 | -2/+10
  Now that the LiveDebugVariables pass is running *after* register coalescing, the ConnectedVNInfoEqClasses class needs to deal with DBG_VALUE instructions.

  This only comes up when rematerialization during coalescing causes the remaining live range of a virtual register to separate into two connected components.

  llvm-svn: 182592
* More symbols that should be static. | Benjamin Kramer | 2013-05-23 | 2 | -7/+5
  llvm-svn: 182590
* Hexagon: Make helper functions static. | Benjamin Kramer | 2013-05-23 | 2 | -3/+5
  llvm-svn: 182588
* R600: Hide symbols of implementation details. | Benjamin Kramer | 2013-05-23 | 4 | -63/+25
  Also removes an unused function.

  llvm-svn: 182587
* InlineSpiller: Store bucket pointers instead of iterators. | Benjamin Kramer | 2013-05-23 | 1 | -9/+9
  Lets us use a SetVector instead of an explicit set + vector combination.

  llvm-svn: 182586
* Setting the default value (fixes CRT assertions about uninitialized variable use when doing debug MSVC builds), and fixing coding style. | Aaron Ballman | 2013-05-23 | 1 | -3/+3
  llvm-svn: 182585
* Fix 32 bit build in c++11 mode. | Rafael Espindola | 2013-05-23 | 1 | -1/+1
  The error was:

      error: non-constant-expression cannot be narrowed from type 'long long' to 'long' in initializer list [-Wc++11-narrowing]
      MI.getOperand(6).getImm() & 0x1F,

  llvm-svn: 182584
* Fix a leak on the r600 backend. | Rafael Espindola | 2013-05-23 | 2 | -8/+12
  This should bring the valgrind bot back to life.

  llvm-svn: 182561
* clang-format this file. | Rafael Espindola | 2013-05-23 | 1 | -29/+25
  llvm-svn: 182560
* [objc-arc] Fixed number of prefixing slashes in some comments in a function from 3 to 2 to match the rest of ObjCARCOpts. | Michael Gottesman | 2013-05-23 | 1 | -6/+6
  llvm-svn: 182557
* Missed removing one of the assert()'s from the LLVMCreateDisasmCPU() library API with my 176880 revision. | Kevin Enderby | 2013-05-23 | 1 | -1/+2
  If a bad Triple is passed in it can also assert. In this case too it should just return 0 to indicate failure to create the disassembler.

  rdar://13955214

  llvm-svn: 182542
* Solidify the assumption that a DW_TAG_subprogram's type is a DW_TAG_subroutine_type | David Blaikie | 2013-05-22 | 2 | -21/+18
  There were bits & pieces of code lying around that may've given the impression that debug info metadata supported the possibility that a subprogram's type could be specified by a non-subroutine type describing the return type of a void function. This support was incomplete & unnecessary. Asserts & API have been changed to make the desired usage more clear.

  llvm-svn: 182532
* Simplify logic now that r182490 is in place. No functional change intended. | Chad Rosier | 2013-05-22 | 11 | -50/+46
  llvm-svn: 182531
* Simplify logic now that r182490 is in place. No functional change intended. | Chad Rosier | 2013-05-22 | 1 | -8/+4
  llvm-svn: 182527
* Simplify logic now that r182490 is in place. No functional change intended. | Chad Rosier | 2013-05-22 | 1 | -15/+14
  llvm-svn: 182526
* Change some PowerPC PatLeaf definitions to ImmLeaf for fast-isel. | Bill Schmidt | 2013-05-22 | 2 | -17/+19
  Using PatLeaf rather than ImmLeaf when defining immediate predicates prevents simple patterns using those predicates from being recognized for fast instruction selection. This patch replaces the immSExt16 PatLeaf predicate with two ImmLeaf predicates, imm32SExt16 and imm64SExt16, allowing a few more patterns to be recognized (ADDI, ADDIC, MULLI, ADDI8, and ADDIC8). Using the new predicates does not help for LI, LI8, SUBFIC, and SUBFIC8 because these are rejected for other reasons, but I see no reason to retain the PatLeaf predicate.

  No functional change intended, and thus no test cases yet. This is preliminary work for enabling fast-isel support for PowerPC. When that support is ready, we'll be able to test this function.

  llvm-svn: 182510
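The condition that the SExt16 predicates express is simply "this immediate survives a round trip through a sign-extended 16-bit field". A minimal sketch of that check, written as a hypothetical Python helper rather than the TableGen definition from the patch:

```python
def is_sext16(value):
    """True if value fits in a sign-extended 16-bit immediate field."""
    return -0x8000 <= value <= 0x7FFF


def roundtrip16(value):
    """Truncate to 16 bits, then sign-extend again."""
    half = value & 0xFFFF
    return half - 0x10000 if half & 0x8000 else half
```

A value passes `is_sext16` exactly when `roundtrip16(value) == value`, which is the property an instruction such as ADDI relies on when it encodes its immediate.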
* SLPVectorizer: Change the order in which new instructions are added to the function. | Nadav Rotem | 2013-05-22 | 3 | -57/+132
  We are not working on a DAG and I ran into a number of problems when I enabled the vectorizations of 'diamond-trees' (trees that share leaves).
  * Improved the numbering API.
  * Changed the placement of new instructions to the last root.
  * Fixed a bug with external tree users with non-zero lane.
  * Fixed a bug in the placement of in-tree users.

  llvm-svn: 182508
* X86: Fix a bug in EltsFromConsecutiveLoads. We can't generate new loads without chains. | Nadav Rotem | 2013-05-22 | 1 | -8/+20
  llvm-svn: 182507
* This is an update to a previous commit (r181216). | Jean-Luc Duprat | 2013-05-22 | 2 | -29/+43
  The earlier change list introduced the following inst combines:

      B * (uitofp i1 C) -> select C, B, 0
      A * (1 - uitofp i1 C) -> select C, 0, A
      select C, 0, B + select C, A, 0 -> select C, A, B

  Together these 3 changes would simplify:

      A * (1 - uitofp i1 C) + B * uitofp i1 C

  down to:

      select C, B, A

  In practice we found that the first two substitutions can have a negative effect on performance, because they reduce opportunities to use FMA contractions; between the two options FMAs are often the better choice. This change list amends the previous one to enable just these inst combines:

      select C, B, 0 + select C, 0, A -> select C, B, A
      A * (1 - uitofp i1 C) + B * uitofp i1 C -> select C, B, A

  llvm-svn: 182499
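The last rewrite can be checked numerically with a small sketch. The two Python functions below stand in for the arithmetic form and the select form; `float(c)` plays the role of `uitofp i1 C`. The functions are illustrative, not part of the patch, and the agreement shown holds for finite inputs only: with an infinite operand the arithmetic form produces NaN (infinity times zero) where the select form does not.

```python
import math


def blend_arith(a, b, c):
    """A * (1 - uitofp i1 C) + B * uitofp i1 C"""
    f = float(c)
    return a * (1.0 - f) + b * f


def blend_select(a, b, c):
    """select C, B, A"""
    return b if c else a


# Finite sample values; both forms agree on every combination.
samples = [(2.5, -7.0), (1.0, 3.0), (100.0, 0.25)]
agree = all(
    blend_arith(a, b, c) == blend_select(a, b, c)
    for a, b in samples
    for c in (False, True)
)
```

Here `agree` is `True`, while `blend_arith(math.inf, 1.0, True)` is NaN and `blend_select(math.inf, 1.0, True)` is `1.0`.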
* Unify formatting of debug output. | Adrian Prantl | 2013-05-22 | 2 | -3/+3
  llvm-svn: 182495
* X86: When expanding PCMPGTQ to PCMPGTD we always want to compare the lower halves as unsigned. | Benjamin Kramer | 2013-05-22 | 1 | -4/+11
  Take #2 on fixing PR15977.

  llvm-svn: 182486
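Why the lower halves need an unsigned comparison can be seen from a scalar model of the expansion. The sketch below is an assumption about the general technique, not the exact instruction sequence emitted by the backend: a signed 64-bit greater-than is built from a signed compare of the high halves plus an unsigned compare of the low halves, and the unsigned compare is itself derived from a signed one by flipping the sign bits.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(bits):
    bits &= MASK32
    return bits - (1 << 32) if bits & 0x80000000 else bits


def signed_gt32(a, b):
    """Per-lane behaviour of a signed 32-bit greater-than."""
    return to_signed32(a) > to_signed32(b)


def unsigned_gt32(a, b):
    """Unsigned compare expressed via the signed one: flip both sign bits."""
    return signed_gt32(a ^ 0x80000000, b ^ 0x80000000)


def signed_gt64(a, b):
    """a, b: 64-bit two's-complement bit patterns."""
    a_hi, a_lo = a >> 32, a & MASK32
    b_hi, b_lo = b >> 32, b & MASK32
    return signed_gt32(a_hi, b_hi) or (a_hi == b_hi and unsigned_gt32(a_lo, b_lo))


def as_bits64(value):
    """Two's-complement encoding of a Python int in 64 bits."""
    return value & MASK64
```

The pair `0x80000000` and `1` shows the problem case: their high halves are equal, and a signed compare of the low halves would wrongly report `0x80000000 < 1`.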
* LoopVectorize: Make Value pointers that could be RAUW'ed a VH | Arnold Schwaighofer | 2013-05-22 | 1 | -3/+4
  The Value pointers we store in the induction variable list can be RAUW'ed by a call to SCEVExpander::expandCodeFor, use a TrackingVH instead. Do the same thing in some other places where we store pointers that could potentially be RAUW'ed.

  Fixes PR16073.

  llvm-svn: 182485
* Fix use after free (pr16103). | Rafael Espindola | 2013-05-22 | 1 | -7/+22
  llvm-svn: 182482
* Check that a function starts with llvm. before using GET_FUNCTION_RECOGNIZER. | Rafael Espindola | 2013-05-22 | 2 | -3/+5
  Fixes a use of uninitialized memory found by asan and valgrind.

  llvm-svn: 182480
* [SystemZ] Rename PSW to CC | Richard Sandiford | 2013-05-22 | 7 | -43/+37
  Addresses a review comment from Ulrich Weigand. No functional change intended.

  I'm not sure whether the old TODO that this patch touches still holds, but that's something we'd get to when adding a targeted scheduling description.

  llvm-svn: 182474
* [SystemZ] Fix thinko in long branch pass | Richard Sandiford | 2013-05-22 | 1 | -26/+40
  The original version of the pass could underestimate the length of a backward branch in cases like:

      alignment to N bytes or more
      ...
      relaxable branch A
      ...
      foo: (aligned to M<N bytes)
      ...
      bar: (aligned to N bytes)
      ...
      relaxable branch B to foo

  We don't add any misalignment gap for "bar" because N bytes of alignment had already been reached earlier in the function. In this case, assuming that A is relaxed can push "foo" closer to "bar", and make B appear to be in range. Similar problems can occur for forward branches.

  I don't think it's possible to create blocks with mixed alignments as things stand, not least because we haven't yet defined getPrefLoopAlignment() for SystemZ (that would need benchmarking). So I don't think we can test this yet.

  Thanks to Rafael Espíndola for spotting the bug.

  llvm-svn: 182460