path: root/llvm/lib
Commit message / Author / Age / Files / Lines
...
* Make helper functions static.Benjamin Kramer2014-04-272-3/+3
| | | | llvm-svn: 207359
* Remove redundant explicit default initialization of non-trivially ↵David Blaikie2014-04-271-1/+1
| | | | | | constructed member. llvm-svn: 207357
* Add the default constructor DwarfAccelTable::DataArray() to initialize ↵NAKAMURA Takumi2014-04-271-0/+1
| | | | | | | | (MCSymbol*)StrSym explicitly. It will fix crash in codegen on msvc x64. llvm-svn: 207356
* SelectionDAG: Aggressively fold shuffles of constant splats.Benjamin Kramer2014-04-271-0/+5
| | | | llvm-svn: 207352
* ARM: MSVC does not support = defaultSaleem Abdulrasool2014-04-271-1/+1
| | | | | | | Explicitly "implement" the destructor as MSVC does not support defaulted methods yet. llvm-svn: 207350
* MC: restore behaviour of defaulting to ELFSaleem Abdulrasool2014-04-271-4/+3
| | | | | | | This restores the previous behaviour of just assuming that, if you don't specify a valid triple, you really meant the default triple with an ELF object file. llvm-svn: 207349
* Add WoA object file emission supportSaleem Abdulrasool2014-04-279-19/+205
| | | | | | | | | | | | | | | | | | | | | | Introduce support for WoA PE/COFF object file emission from LLVM. Add the new target specific PE/COFF Streamer (ARMWinCOFFStreamer) that handles the ARM specific behaviour of PE/COFF object emission. ARM exception information is not yet emitted and is a TODO item. The ARM specific object writer (ARMWinCOFFObjectWriter) handles the ARM specific relocation handling in conjunction with the WinCOFFObjectWriter in the MC layer. The MC layer needs to be updated to deal with the relocation adjustments. Branch relocations are adjusted by 4 bytes (unlike their ELF counterparts). Minor tweaks to switch multiple conditional checks into equivalent switch statements. The ObjectFileInfo is updated to relax the object file setup for Windows COFF. Move the architecture checks into an assertion. Windows COFF is currently only supported on x86, x86_64, and ARM (thumb). Rather than defaulting to ELF, we will refuse to generate an object file. This is better though as you do not get an (arbitrary) object file which is different from the request. llvm-svn: 207345
* MC: create X86WinCOFFStreamer for target specific behaviourSaleem Abdulrasool2014-04-275-16/+63
| | | | | | | | | | This introduces a target specific streamer, X86WinCOFFStreamer, which handles the target specific behaviour (e.g. WinEH). This is mostly to ensure that differences between ARM and X86 remain disjoint and do not accidentally cross boundaries. This is the final staging change for enabling object emission for Windows on ARM. llvm-svn: 207344
* MC: rename WinCOFFStreamer and move declaration out-of-lineSaleem Abdulrasool2014-04-271-90/+57
| | | | | | | | | This is in preparation for promoting WinCOFFStreamer to a base class which will be shared by the X86 and ARM specific target COFF streamers. Also add a new getOrCreateSymbolData interface (like MCELFStreamer) for the ARM COFF Streamer. This makes the COFFStreamer more similar to the ELFStreamer. llvm-svn: 207343
* MC: style tweaks to WinCOFFStreamerSaleem Abdulrasool2014-04-271-65/+57
| | | | | | | Stylistic changes to prepare for splitting up the COFFStreamer into target specific streamers. Tweak some assertion messages. No functional change. llvm-svn: 207342
* ARM: Support SingleParameterDotFile on WoASaleem Abdulrasool2014-04-271-0/+1
| | | | | | | | | Currently, the integrated assembler is the only choice for assembling Windows on ARM binaries. IAS supports the .file <filename> directive which emits the file symbol into the resulting object binary. Mark the GNU COFF information to indicate support for this feature. llvm-svn: 207341
* [LCG] Re-organize the methods for mutating a call graph to make theirChandler Carruth2014-04-271-76/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | API requirements much more obvious. The key here is that there are two totally different use cases for mutating the graph. Prior to doing any SCC formation, it is very easy to mutate the graph. There may be users that want to do small tweaks here, and then use the already-built graph for their SCC-based operations. This method remains on the graph itself and is documented carefully as being cheap but unavailable once SCCs are formed. Once SCCs are formed, and there is some in-flight DFS building them, we have to be much more careful in how we mutate the graph. These mutation operations are sunk onto the SCCs themselves, which both simplifies things (the code was already there!) and helps make it obvious that these interfaces are only applicable within that context. The other primary constraint is that the edge being mutated is actually related to the SCC on which we call the method. This helps make it obvious that you cannot arbitrarily mutate some other SCC. I've tried to write much more complete documentation for the interesting mutation API -- intra-SCC edge removal. Currently one aspect of this documentation is a lie (the result list of SCCs) but we also don't even have tests for that API. =[ I'm going to add tests and fix it to match the documentation next. llvm-svn: 207339
* DAGCombiner: Simplify code a bit, make more transforms work with vectors.Benjamin Kramer2014-04-261-58/+37
| | | | llvm-svn: 207338
* DwarfDebug: Roll argument into call.David Blaikie2014-04-261-10/+6
| | | | llvm-svn: 207334
* DebugInfo: Fix and test a regression caused by r207263 causing the ↵David Blaikie2014-04-261-1/+1
| | | | | | | | DW_AT_object_pointer to go missing on blocks Noticed by inspection. Test coverage added. llvm-svn: 207333
* Replace std::vector with SmallVector for some small, known size vectors.Craig Topper2014-04-261-4/+4
| | | | llvm-svn: 207330
* Convert getMemIntrinsicNode to take ArrayRef of SDValue instead of pointer ↵Craig Topper2014-04-2611-83/+62
| | | | | | and size. llvm-svn: 207329
* Convert SelectionDAG::getNode methods to use ArrayRef<SDValue>.Craig Topper2014-04-2631-595/+437
| | | | llvm-svn: 207327
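The ArrayRef conversions in r207329 and r207327 replace (pointer, size) parameter pairs with a single non-owning view. A minimal sketch of the idea follows. `ArrayRefLike` is a hypothetical stand-in written for this example, not LLVM's actual `llvm::ArrayRef`, and `sumOperands` with `int` elements is an assumed placeholder for a function taking `SDValue` operands.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for a non-owning array view.
template <typename T> class ArrayRefLike {
  const T *Data = nullptr;
  std::size_t Length = 0;

public:
  // Old-style call sites can still pass a pointer and a size explicitly.
  ArrayRefLike(const T *D, std::size_t N) : Data(D), Length(N) {}
  // Implicit conversion from a vector: no copy, just a view.
  ArrayRefLike(const std::vector<T> &V) : Data(V.data()), Length(V.size()) {}
  // Implicit conversion from a C array: the size is deduced, not hand-written.
  template <std::size_t N>
  ArrayRefLike(const T (&A)[N]) : Data(A), Length(N) {}

  const T *begin() const { return Data; }
  const T *end() const { return Data + Length; }
  std::size_t size() const { return Length; }
};

// One parameter instead of (const int *Ops, unsigned NumOps).
int sumOperands(ArrayRefLike<int> Ops) {
  int Sum = 0;
  for (int V : Ops)
    Sum += V;
  return Sum;
}
```

Callers can then pass a `std::vector<int>` or a plain array directly, and the element count can no longer drift out of sync with the pointer.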
* Remove an unused version of getMemIntrinsicNode and getNode. Additionally, ↵Craig Topper2014-04-261-20/+0
| | | | | | these were calling makeVTList with the pointers passed in, which were unlikely to belong to SelectionDAG and likely would have just been stack pointers. llvm-svn: 207326
* DWARF Type Units: Avoid emitting type units under fission if the type ↵David Blaikie2014-04-265-7/+54
| | | | | | | | | | | | requires an address. Since there's no way to ensure the type unit in the .dwo and the type unit skeleton in the .o are correlated, this cannot work. This implementation is a bit inefficient for a few reasons, called out in comments. llvm-svn: 207323
* Print X86ISD::PMULDQ nodes properly in debug output.Benjamin Kramer2014-04-261-0/+1
| | | | llvm-svn: 207322
* DwarfDebug: Minor refactoring around type unit constructionDavid Blaikie2014-04-262-16/+21
| | | | | | | | | | | | | Sinking addition of the declaration attribute down to where the signature is added. So that if the signature is not added neither is the declaration attribute (this will come in handy when aborting type unit construction to instead emit the type into the CU directly in some cases) Pull out type unit identifier hashing just to simplify the function a little, it'll be getting longer. llvm-svn: 207321
* X86TTI: i16/i32 vector div with a constant (splat) divisor are reasonably ↵Benjamin Kramer2014-04-261-0/+19
| | | | | | | | cheap now. Turn vectorization back on. llvm-svn: 207320
* X86: Lower SMUL_LOHI of v4i32 to pmuldq when SSE4.1 is available.Benjamin Kramer2014-04-264-14/+56
| | | | llvm-svn: 207318
* X86: Add patterns for MULHU/MULHS of v8i16 and v16i16.Benjamin Kramer2014-04-262-4/+18
| | | | | | | This gets us pretty code for divs of i16 vectors. Turn the existing intrinsics into the corresponding nodes. llvm-svn: 207317
* Rip out X86-specific vector SDIV lowering, make the corresponding ↵Benjamin Kramer2014-04-262-77/+24
| | | | | | DAGCombiner transform work on vectors. llvm-svn: 207316
* DAGCombiner: Turn divs of vector splats into vectorized multiplications.Benjamin Kramer2014-04-264-24/+64
| | | | | | | | | | | | Otherwise the legalizer would just scalarize everything. Support for mulhi in the targets isn't that great yet so on most targets we get exactly the same scalarized output. Add a test for x86 vector udiv. I had to disable the mulhi nodes on ARM because there aren't any patterns for it. As far as I know ARM has instructions for getting the high part of a multiply so this should be fixed. llvm-svn: 207315
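As a scalar illustration of why a division by a constant splat can become a multiply-high, the sketch below divides four unsigned 32-bit lanes by 3. The names `mulhu` and `udivBy3` are hypothetical helpers for this example, and the magic constant is the standard one for unsigned division by 3; the commit itself does not spell out which constants DAGCombiner picks.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// High 32 bits of the 64-bit product, i.e. what a MULHU node computes per lane.
inline uint32_t mulhu(uint32_t A, uint32_t B) {
  return static_cast<uint32_t>((static_cast<uint64_t>(A) * B) >> 32);
}

// Divide every lane by 3 without a divide instruction.
std::array<uint32_t, 4> udivBy3(const std::array<uint32_t, 4> &V) {
  const uint32_t Magic = 0xAAAAAAABu; // ceil(2^33 / 3)
  std::array<uint32_t, 4> R{};
  for (std::size_t I = 0; I < R.size(); ++I)
    R[I] = mulhu(V[I], Magic) >> 1; // (x * Magic) >> 33 == x / 3
  return R;
}
```

Every lane uses the same constant, which is why a splat divisor is the case that vectorizes cleanly.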
* X86: Custom lower v4i32 UMUL_LOHI into 2 pmuludqs.Benjamin Kramer2014-04-261-0/+37
| | | | | | Test will follow soon. llvm-svn: 207314
* Revert r206749 till a final decision about the intrinsics is made.Michael Zolotukhin2014-04-263-239/+0
| | | | llvm-svn: 207313
* [LCG] Rather than removing nodes from the SCC entry set when we processChandler Carruth2014-04-261-6/+7
| | | | | | | | | | them, just skip over any DFS-numbered nodes when finding the next root of a DFS. This allows the entry set to just be a vector as we populate it from a uniqued source. It also removes the possibility for a linear scan of the entry set to actually do the removal which can make things go quadratic if we get unlucky. llvm-svn: 207312
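A rough sketch of the root-finding idea described above, under the assumption that a DFS number of 0 means "not yet visited". `Node` and `nextDFSRoot` are hypothetical names for this example and do not come from the LazyCallGraph sources.

```cpp
#include <cassert>
#include <vector>

// Hypothetical node: DFSNumber == 0 means the DFS has not reached it yet.
struct Node {
  int DFSNumber = 0;
};

// Pop entries until an unvisited one turns up. Already-numbered nodes are
// simply skipped, so nothing ever has to be searched for and erased from
// the middle of the entry set.
Node *nextDFSRoot(std::vector<Node *> &EntryNodes) {
  while (!EntryNodes.empty()) {
    Node *N = EntryNodes.back();
    EntryNodes.pop_back();
    assert(N && "entry set must not contain null nodes");
    if (N->DFSNumber == 0)
      return N;
  }
  return nullptr; // every remaining entry was already visited
}
```

Each entry is popped at most once, so the total cost over all root searches stays linear in the size of the entry set.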
* [LCG] Rotate the full SCC finding algorithm to avoid round-trips throughChandler Carruth2014-04-261-21/+23
| | | | | | | | | | | | | | | the DFS stack for leaves in the call graph. As mentioned in my previous commit, this is particularly interesting for graphs which have high fan out but low connectivity resulting in many leaves. For such graphs, this can remove a large % of the DFS stack traffic even though it doesn't make the stack much smaller. It's a bit easier to formulate this for the full algorithm because that one stops completely for each SCC. For example, I was able to directly eliminate the "Recurse" boolean used to continue an outer loop from the inner loop. llvm-svn: 207311
* [LCG] Hoist the main DFS loop out of the edge removal function. ThisChandler Carruth2014-04-261-74/+70
| | | | | | | | | makes working through the worklist much cleaner, and makes it possible to avoid the 'bool-to-continue-the-outer-loop' hack. Not a huge difference, but I think this is approaching as polished as I can make it. llvm-svn: 207310
* RecursivelyDeleteTriviallyDeadInstructions() could removeGerolf Hoflehner2014-04-262-2/+18
| | | | | | | | | | | more than 1 instruction. The caller needs to be aware of this and adjust instruction iterators accordingly. rdar://16679376 Repaired r207302. llvm-svn: 207309
* Restore CloneFunction.cpp which got accidentallyGerolf Hoflehner2014-04-261-92/+33
| | | | | | overwritten by previous backout of r207303 llvm-svn: 207308
* [LCG] In the incremental SCC re-formation, lift the node currently beingChandler Carruth2014-04-261-30/+38
| | | | | | | | | | | | | | | | | | | processed in the DFS out of the stack completely. Keep it exclusively in a variable. Re-shuffle some code structure to make this easier. This can have a very dramatic effect in some cases because call graphs tend to look like a high fan-out spanning tree. As a consequence, there are a large number of leaf nodes in the graph, and this technique causes leaf nodes to never even go into the stack. While this only reduces the max depth by 1, it may cause the total number of round trips through the stack to drop by a lot. Now, most of this isn't really relevant for the incremental version. =] But I wanted to prototype it first here as this variant is in ways more complex. As long as I can get the code factored well here, I'll next make the primary walk look the same. There are several refactorings this exposes I think. llvm-svn: 207306
* [LCG] Special case the removal of self edges. These don't impact the SCCChandler Carruth2014-04-261-0/+6
| | | | | | | | graph in any way because we don't track edges in the SCC graph, just nodes. This also lets us add a nice assert about the invariant that we're working on at least a certain number of nodes within the SCC. llvm-svn: 207305
* [DAG] During DAG legalization keep opaque constants even after expanding.Juergen Ributzka2014-04-261-3/+8
| | | | | | | | | | | | | | | | | | | | | The included test case would return incorrect results, because the expansion of a shift with a constant shift amount of 0 would generate undefined behavior. This is because ExpandShiftByConstant assumes that all shifts by constants with a value of 0 have already been optimized away. This doesn't happen for opaque constants and usually this isn't a problem, because opaque constants won't take this code path - they are not supposed to. In the case that the opaque constant has to be expanded by the legalizer, the legalizer would drop the opaque flag. In this case we hit the limitations of ExpandShiftByConstant and create incorrect code. This commit fixes the legalizer by not dropping the opaque flag when expanding opaque constants and adding an assertion to ExpandShiftByConstant to catch this unsupported case in the future. This fixes <rdar://problem/16718472> llvm-svn: 207304
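To see why a shift amount of 0 is a problem for this kind of expansion, here is a scalar C++ analogy that splits a 64-bit left shift into two 32-bit halves. `Halves` and `expandShlByConstant` are hypothetical names for this sketch; it models the arithmetic only and is not the legalizer code.

```cpp
#include <cassert>
#include <cstdint>

struct Halves {
  uint32_t Lo;
  uint32_t Hi;
};

// Expand a 64-bit `shl` by a known constant into operations on 32-bit halves.
Halves expandShlByConstant(Halves In, unsigned Amt) {
  // Mirrors the kind of assertion the commit adds: a zero amount is expected
  // to have been folded away before expansion.
  assert(Amt != 0 && "shift by zero must be folded before expansion");
  if (Amt >= 64)
    return {0, 0};
  if (Amt > 32)
    return {0, In.Lo << (Amt - 32)};
  if (Amt == 32)
    return {0, In.Lo};
  // 0 < Amt < 32. With Amt == 0 the `32 - Amt` below would shift a 32-bit
  // value by 32 bits, which is undefined behavior.
  return {In.Lo << Amt, (In.Hi << Amt) | (In.Lo >> (32 - Amt))};
}
```

The general case needs the complementary shift `32 - Amt`, and that is exactly the term that stops being well defined when the amount is 0.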
* Revert commit r207302 since build failuresGerolf Hoflehner2014-04-263-51/+94
| | | | | | have been reported. llvm-svn: 207303
* RecursivelyDeleteTriviallyDeadInstructions() could removeGerolf Hoflehner2014-04-262-2/+18
| | | | | | | more than 1 instruction. The caller needs to be aware of this and adjust instruction iterators accordingly. rdar://16679376 llvm-svn: 207302
* [X86] Implement TargetLowering::getScalingFactorCost hook.Quentin Colombet2014-04-262-0/+19
| | | | | | | | | Scaling factors are not free on X86 because every "complex" addressing mode breaks the related instruction into 2 allocations instead of 1. <rdar://problem/16730541> llvm-svn: 207301
* [LCG] Refactor the duplicated code I added in my last commit here intoChandler Carruth2014-04-261-23/+14
| | | | | | | a helper function. Also factor the other two places where we did the same thing into the helper function. =] Much cleaner this way. NFC. llvm-svn: 207300
* [InstCombine][X86] Teach how to fold calls to SSE2/AVX2 packed logical shiftAndrea Di Biagio2014-04-261-9/+41
| | | | | | | | | | right intrinsics. A packed logical shift right with a shift count bigger than or equal to the element size always produces a zero vector. In all other cases, it can be safely replaced by a 'lshr' instruction. llvm-svn: 207299
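The folding rule above can be modelled on four 32-bit lanes as follows. This is a sketch of the semantics only, with the hypothetical name `packedLogicalShiftRight`; it is not the InstCombine code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V4 = std::array<uint32_t, 4>;

// Model of a packed logical shift right on 32-bit elements.
V4 packedLogicalShiftRight(const V4 &V, uint64_t Count) {
  V4 R{}; // value-initialized: all lanes zero
  // A count at or above the element size yields the zero vector. A plain
  // C++ `>>` by 32 or more on a 32-bit value would be undefined, which is
  // why this case has to be handled separately from an ordinary lshr.
  if (Count >= 32)
    return R;
  for (std::size_t I = 0; I < R.size(); ++I)
    R[I] = V[I] >> Count; // in range: behaves like a plain 'lshr'
  return R;
}
```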
* Add missing include guards and missing #include, found by modules build.Richard Smith2014-04-261-0/+6
| | | | llvm-svn: 207298
* Optimization for certain shufflevector by using insertps.Filipe Cabecinhas2014-04-251-0/+104
| | | | | | | | | | | | | | | | | | | | Summary: If we're doing a v4f32/v4i32 shuffle on x86 with SSE4.1, we can lower certain shufflevectors to an insertps instruction: When most of the shufflevector result's elements come from one vector (and keep their index), and one element comes from another vector or a memory operand. Added tests for insertps optimizations on shufflevector. Added support and tests for v4i32 vector optimization. Reviewers: nadav Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3475 llvm-svn: 207291
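The shuffle shape described in the summary can be recognized from the mask alone. The sketch below is a simplified, assumed version of that check: it only handles the case where the first vector is the base, ignores undef lanes and memory operands, and uses hypothetical names (`InsertMatch`, `matchSingleInsert`). Mask indices 0..3 select from the first vector and 4..7 from the second, as in a shufflevector mask.

```cpp
#include <cassert>

struct InsertMatch {
  bool Matched; // true if exactly one lane comes from the second vector
  int DstLane;  // lane of the result that receives the inserted element
  int SrcLane;  // lane of the second vector that supplies it
};

// Accept masks where every lane keeps its index from the first vector,
// except for exactly one lane taken from the second vector.
InsertMatch matchSingleInsert(const int (&Mask)[4]) {
  InsertMatch M = {false, -1, -1};
  for (int I = 0; I < 4; ++I) {
    if (Mask[I] == I)
      continue; // element stays in place
    // Reject moved first-vector elements, out-of-range indices, and a
    // second element coming from the other vector.
    if (Mask[I] < 4 || Mask[I] > 7 || M.DstLane != -1)
      return {false, -1, -1};
    M.DstLane = I;
    M.SrcLane = Mask[I] - 4;
  }
  M.Matched = M.DstLane != -1;
  return M;
}
```

For the mask `{0, 1, 6, 3}` this reports that lane 2 of the result is taken from lane 2 of the second vector, which is the single-element insertion the commit describes.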
* Revert "blockfreq: Approximate irreducible control flow"Duncan P. N. Exon Smith2014-04-251-210/+20
| | | | | | | | | | | | This reverts commit r207286. It causes an ICE on the cmake-llvm-x86_64-linux buildbot [1]: llvm/lib/Analysis/BlockFrequencyInfo.cpp: In lambda function: llvm/lib/Analysis/BlockFrequencyInfo.cpp:182:1: internal compiler error: in get_expr_operands, at tree-ssa-operands.c:1035 [1]: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/12093/steps/build_llvm/logs/stdio llvm-svn: 207287
* blockfreq: Approximate irreducible control flowDuncan P. N. Exon Smith2014-04-251-20/+210
| | | | | | | | | | | | | | | | | | | | | | Previously, irreducible backedges were ignored. With this commit, irreducible SCCs are discovered on the fly, and modelled as loops with multiple headers. This approximation specifies the headers of irreducible sub-SCCs as its entry blocks and all nodes that are targets of a backedge within it (excluding backedges within true sub-loops). Block frequency calculations act as if we insert a new block that intercepts all the edges to the headers. All backedges and entries to the irreducible SCC point to this imaginary block. This imaginary block has an edge (with even probability) to each header block. The result is now reasonable enough that I've added a number of testcases for irreducible control flow. I've outlined in `BlockFrequencyInfoImpl.h` ways to improve the approximation. <rdar://problem/14292693> llvm-svn: 207286
* Unbreak the gdb buildbot by not lowering dbg.declare intrinsics for arrays.Adrian Prantl2014-04-251-1/+7
| | | | llvm-svn: 207284
* Make sure that rangelists are also relative to the compile unitEric Christopher2014-04-251-2/+9
| | | | | | | | low_pc similar to location lists. Fixes PR19563 llvm-svn: 207283
* R600: Fix function name printing in LowerCallMatt Arsenault2014-04-251-1/+3
| | | | | | | | v2: Check both ExternalSymbol and GlobalAddress Patch by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 207282
* DwarfAccelTable: Store the string symbol in the accelerator table to avoid ↵David Blaikie2014-04-253-35/+39
| | | | | | | | | | duplicate lookup. This also avoids the need for subtly side-effecting calls to manifest strings in the string table at the point where items are added to the accelerator tables. llvm-svn: 207281