path: root/llvm/lib
Commit message / Author / Age / Files / Lines
...
* Add back a line I deleted by accident in r145141. Fixes uninitialized ↵Eli Friedman2011-11-281-0/+1
| | | | | | variable warnings and runtime failures. llvm-svn: 145256
* Silence wrong warnings from GCC about variables possibly being usedDuncan Sands2011-11-281-2/+2
| | | | | | | uninitialized: GCC doesn't understand that the variables are only used if !UseImm, in which case they have been initialized. llvm-svn: 145239
* Add X86 instruction selection for VPERM2I128 when AVX2 is enabled. Merge ↵Craig Topper2011-11-284-110/+117
| | | | | | VPERMILPS/VPERMILPD detection since they are pretty similar. llvm-svn: 145238
* Make isCommutedVSHUFP more like the way isCommutedSHUFP is handled.Craig Topper2011-11-281-35/+81
| | | | llvm-svn: 145218
* rename ENABLE_THREADS to LLVM_ENABLE_THREADSDylan Noblesmith2011-11-284-7/+7
| | | | | | | | | | | Now that it needs to be exported in a public header (Valgrind.h) it should be prefixed to avoid collision with other projects. Add it to llvm-config.h as well. This'll require regenerating the configure script after this commit, but I don't have the required autoconf version. llvm-svn: 145214
* Place the "cfg checksum" around a test. This was recently added in April 2011 toNick Lewycky2011-11-271-67/+67
| | | | | | | | | | | | | gcc, though I thought it was older (my gcc 4.4 has it as a local patch. Whoops!) This fixes PR10589. Also add some debugging statements. Remove GcnoFiles, the mapping from CompilationUnit to raw_ostream. Now that we start by iterating over each CU and descending into them, there's no need to maintain a mapping. llvm-svn: 145208
* Merge detecting and handling for VSHUFPSY and VSHUFPDY since a lot of the ↵Craig Topper2011-11-271-92/+39
| | | | | | code was similar for both. llvm-svn: 145199
* Prevent rotating the blocks of a loop (and thus getting a backedge to beChandler Carruth2011-11-271-0/+16
| | | | | | | | | | | | | | fallthrough) in cases where we might fail to rotate an exit to an outer loop onto the end of the loop chain. Having *some* rotation, but not performing this rotation, is the primary fix for the performance regression with -enable-block-placement for Olden/em3d (a whopping 30% regression). Still working on reducing the test case that actually exercises this and the new rotation strategy out of this code, but I want to check if this regresses other test cases first as that may indicate it isn't the correct fix. llvm-svn: 145195
* Take two on rotating the block ordering of loops. My previous attemptChandler Carruth2011-11-271-85/+103
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | was centered around the premise of laying out a loop in a chain, and then rotating that chain. This is good for preserving contiguous layout, but bad for actually making sane rotations. In order to keep it safe, I had to essentially make it impossible to rotate deeply nested loops. The information needed to correctly reason about a deeply nested loop is actually available -- *before* we layout the loop. We know the inner loops are already fused into chains, etc. We lose information the moment we actually lay out the loop. The solution was the other alternative for this algorithm I discussed with Benjamin and some others: rather than rotating the loop after-the-fact, try to pick a profitable starting block for the loop's layout, and then use our existing layout logic. I was worried about the complexity of this "pick" step, but it turns out such complexity is needed to handle all the important cases I keep teasing out of benchmarks. This is, I'm afraid, a bit of a work-in-progress. It is still misbehaving on some likely important cases I'm investigating in Olden. It also isn't really tested. I'm going to try to craft some interesting nested-loop test cases, but it's likely to be extremely time consuming and I don't want to go there until I'm sure I'm testing the correct behavior. Sadly I can't come up with a way of getting simple, fine grained test cases for this logic. We need complex loop structures to even trigger much of it. llvm-svn: 145183
* Revert r145180 as it is causing test failures on all the bots.Chandler Carruth2011-11-274-126/+32
| | | | | | | | | | | | | Original commit message: Fixed ObjectFile functions: - getSymbolOffset() renamed as getSymbolFileOffset() - getSymbolFileOffset(), getSymbolAddress(), getRelocationAddress() returns same result for ELFObjectFile, MachOObjectFile and COFFObjectFile. - added getRelocationOffset() - fixed MachOObjectFile::getSymbolSize() - fixed MachOObjectFile::getSymbolSection() - fixed MachOObjectFile::getSymbolOffset() for symbols without section data. llvm-svn: 145182
* Fix an impressive type-o / spell-o Duncan noticed.Chandler Carruth2011-11-271-1/+1
| | | | llvm-svn: 145181
* Fixed ObjectFile functions:Danil Malyshev2011-11-274-32/+126
| | | | | | | | | | | - getSymbolOffset() renamed as getSymbolFileOffset() - getSymbolFileOffset(), getSymbolAddress(), getRelocationAddress() returns same result for ELFObjectFile, MachOObjectFile and COFFObjectFile. - added getRelocationOffset() - fixed MachOObjectFile::getSymbolSize() - fixed MachOObjectFile::getSymbolSection() - fixed MachOObjectFile::getSymbolOffset() for symbols without section data. llvm-svn: 145180
* Rework a bit of the implementation of loop block rotation to not rely soChandler Carruth2011-11-271-21/+31
| | | | | | | | | | | | | | | heavily on AnalyzeBranch. That routine doesn't behave as we want given that rotation occurs mid-way through re-ordering the function. Instead merely check that there are not unanalyzable branching constructs present, and then reason about the CFG via successor lists. This actually simplifies my mental model for all of this as well. The concrete result is that we now will rotate more loop chains. I've added a test case from Olden highlighting the effect. There is still a bit more to do here though in order to regain all of the performance in Olden. llvm-svn: 145179
* Eli managed to kill off llvm.membarrier in llvm 3.0 also, this meansChris Lattner2011-11-271-34/+8
| | | | | | that mainline needs no autoupgrade logic for intrinsics yet, woohoo! llvm-svn: 145178
* The llvm.atomic intrinsics *were* removed in LLVM 3.0 (in r141333), remove the Chris Lattner2011-11-271-68/+1
| | | | | | autoupgrade logic for 2.9 and before. llvm-svn: 145176
* remove autoupgrade support for old forms of llvm.prefetch and the oldChris Lattner2011-11-271-101/+1
| | | | | | | trampoline forms. Both of these were correct in LLVM 3.0, and we don't need to support LLVM 2.9 and earlier in mainline. llvm-svn: 145174
* remove asmparsing and documentation support for "volatile load", which was ↵Chris Lattner2011-11-272-28/+8
| | | | | | only produced by LLVM 2.9 and earlier. LLVM 3.0 and later prefers "load volatile". llvm-svn: 145172
* remove autoupgrade support for really old-style debug info intrinsics.Chris Lattner2011-11-273-47/+0
| | | | | | | I think this is the last of autoupgrade that can be removed in 3.1. Can the atomic upgrade stuff also go? llvm-svn: 145169
* remove some old autoupgrade logicChris Lattner2011-11-271-80/+1
| | | | llvm-svn: 145167
* remove autoupgrade support for LLVM 2.9 exception stuff. Mainline supportsChris Lattner2011-11-273-252/+0
| | | | | | LLVM 3.0 and later. llvm-svn: 145165
* remove support for reading llvm 2.9 .bc files. LLVM 3.1 is only compatible ↵Chris Lattner2011-11-272-277/+0
| | | | | | back to 3.0 llvm-svn: 145164
* Add several new instructions supported by the latest MicroBlaze.Wesley Peck2011-11-274-1/+53
| | | | | | These instructions are not generated by the backend yet, this will come in a later commit. llvm-svn: 145161
* Optimize comparison against 0 in conditional instructions.Wesley Peck2011-11-271-2/+156
| | | | | | Fix a couple of 80-column violations. llvm-svn: 145159
* Introduce a loop block rotation optimization to the new block placementChandler Carruth2011-11-271-3/+92
| | | | | | | | | | | | | | | pass. This is designed to achieve one of the important optimizations that the old code placement pass did, but more simply. This is a somewhat rough and *very* conservative version of the transform. We could get a lot fancier here if there are profitable cases to do so. In particular, this only looks for a single pattern, it insists that the loop backedge being rotated away is the last backedge in the chain, and it doesn't provide any means of doing better in-loop placement due to the rotation. However, it appears that it will handle the important loops I am finding in the LLVM test suite. llvm-svn: 145158
* Move code into anonymous namespaces.Benjamin Kramer2011-11-269-37/+35
| | | | llvm-svn: 145154
* Merge 128-bit and 256-bit X86ISD node types for VPERMILPS and VPERMILPD. ↵Craig Topper2011-11-264-39/+15
| | | | | | Simplify some shuffle lowering code since V1 can never be UNDEF due to the canonicalization that occurs when shuffle nodes are created. llvm-svn: 145153
* Rename a couple of options and fix some simple typos.Wesley Peck2011-11-264-6/+6
| | | | llvm-svn: 145152
* Collapse X86ISD node types for PUNPCKH*, PUNPCKL*, UNPCKLP*, and UNPCKHP* to ↵Craig Topper2011-11-264-178/+116
| | | | | | not be type specific. Now we just have integer high and low and floating point high and low. Pattern matching will choose the correct instruction based on the vector type. llvm-svn: 145148
* Fix APFloat::convert so that it handles narrowing conversions correctly; itEli Friedman2011-11-261-49/+36
| | | | | | | | was returning incorrect values in rare cases, and incorrectly marking exact conversions as inexact in some more common cases. Fixes PR11406, and a missed optimization in test/CodeGen/X86/fp-stack-O0.ll. llvm-svn: 145141
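The distinction between an exact and an inexact narrowing conversion, which the commit above says was being reported incorrectly, can be illustrated outside LLVM. The following is a minimal Python sketch, not APFloat's code: it assumes IEEE binary64 to binary32 narrowing via the standard `struct` module, and the function name `narrow_to_float32` is hypothetical.

```python
import struct


def narrow_to_float32(value):
    """Round a binary64 value to binary32 and report whether that was exact.

    Returns (narrowed_value, is_exact). This models the idea only; it is
    not how APFloat::convert is implemented.
    """
    try:
        # Packing as "<f" rounds to the nearest binary32 value.
        narrowed = struct.unpack("<f", struct.pack("<f", value))[0]
    except OverflowError:
        # struct refuses finite values that are too large for binary32.
        return (float("inf") if value > 0 else float("-inf")), False
    both_nan = narrowed != narrowed and value != value
    return narrowed, (narrowed == value or both_nan)
```

A value such as 0.5 is representable in both formats, so the conversion is exact; 0.1 is not representable in binary32 without rounding, so it is inexact.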
* This patch contains support for encoding FMA4 instructions andBruno Cardoso Lopes2011-11-254-7/+86
| | | | | | | | | tablegen patterns for scalar FMA4 operations and intrinsic. Also add tests for vfmaddsd. Patch by Jan Sjodin llvm-svn: 145133
* ARMLoadStoreOptimizer.cpp: Fix MSVC(Debug) build.NAKAMURA Takumi2011-11-251-0/+1
| | | | llvm-svn: 145129
* Remove 256-bit specific node types for UNPCKHPS/D and instead use the ↵Craig Topper2011-11-244-50/+24
| | | | | | 128-bit versions and let the operand type distinguish. Also fix the load form of the v8i32 patterns for these to realize that the load would be promoted to v4i64. llvm-svn: 145126
* Remove AVX2 specific X86ISD node types for PUNPCKH/L and instead just reuse ↵Craig Topper2011-11-244-81/+33
| | | | | | the 128-bit versions and let the vector type distinguish. llvm-svn: 145125
* Devirtualize Pass::getPassID, overriding it isn't useful and it gets called ↵Benjamin Kramer2011-11-241-2/+0
| | | | | | | | a lot. While at it pull the trivial ctor in line. llvm-svn: 145124
* Make ConstantRange::truncate a bit more efficient.Benjamin Kramer2011-11-241-4/+2
| | | | llvm-svn: 145122
* X86: alias cqo to cqto.Benjamin Kramer2011-11-241-1/+2
| | | | llvm-svn: 145121
* Fix a silly use-after-free issue. A much earlier version of this codeChandler Carruth2011-11-241-2/+2
| | | | | | | | | | needed lots of fanciness around retaining a reference to a Chain's slot in the BlockToChain map, but that's all gone now. We can just go directly to allocating the new chain (which will update the mapping for us) and using it. Somewhat gross mechanically generated test case replicates the issue Duncan spotted when actually testing this out. llvm-svn: 145120
* When adding blocks to the list of those which no longer have any CFGChandler Carruth2011-11-241-3/+3
| | | | | | | | | | | | | | conflicts, we should only be adding the first block of the chain to the list, lest we try to merge into the middle of that chain. Most of the places we were doing this we already happened to be looking at the first block, but there is no reason to assume that, and in some cases it was clearly wrong. I've added a couple of tests here. One already worked, but I like having an explicit test for it. The other is reduced from a test case Duncan reduced for me and used to crash. Now it is handled correctly. llvm-svn: 145119
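The rule stated in the commit above (queue only the first block of a conflict-free chain) can be sketched as follows. This is an illustrative Python model under assumed data structures, not the pass's C++ code: chains are plain lists, `block_to_chain` is a dictionary, and the name `enqueue_chain_head` is hypothetical.

```python
from collections import deque


def enqueue_chain_head(block, block_to_chain, worklist):
    """Queue the head of the chain containing `block`, never `block` itself.

    Queuing a mid-chain block would invite a later merge into the middle
    of the chain, which is the failure the commit message describes.
    """
    head = block_to_chain[block][0]
    if head not in worklist:
        worklist.append(head)
    return head
```

Whichever block of a chain triggers the update, the worklist ends up holding only that chain's first block, and it holds it once.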
* This patch makes the following changes necessary for MIPS' direct code emission.Akira Hatanaka2011-11-236-55/+236
| | | | | | | | - lower unaligned loads/stores. - encode the size operand of instructions INS and EXT. - emit relocation information needed for JAL (jump-and-link). llvm-svn: 145113
* This patch addresses gp relative fixups/relocations for jump tables.Akira Hatanaka2011-11-235-7/+38
| | | | llvm-svn: 145112
* Correctly byte-swap APInts with bit-widths greater than 64.Richard Smith2011-11-231-17/+26
| | | | llvm-svn: 145111
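For reference, byte-swapping a value wider than 64 bits means reversing all of its bytes, not just those of one 64-bit word. A minimal Python sketch of that behaviour follows. It is not APInt's implementation: the name `byte_swap` is hypothetical, and the sketch assumes an unsigned value whose bit width is a multiple of 8.

```python
def byte_swap(value, bit_width):
    """Reverse the byte order of an unsigned integer of the given width."""
    if bit_width % 8 != 0:
        raise ValueError("bit_width must be a multiple of 8")
    if not 0 <= value < (1 << bit_width):
        raise ValueError("value does not fit in bit_width bits")
    num_bytes = bit_width // 8
    # Serialize little-endian, then reinterpret the same bytes big-endian.
    return int.from_bytes(value.to_bytes(num_bytes, "little"), "big")
```

Applying `byte_swap` twice with the same width returns the original value, which makes a convenient sanity check for widths such as 80 or 128 bits.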
* Validate the return type when checking if a function is malloc.Benjamin Kramer2011-11-231-4/+4
| | | | | | Fixes PR11426. Not sure if a test case with a "wrong" malloc would be useful. llvm-svn: 145106
* Fix a crash in which a multiplication was being reported as being both negativeDuncan Sands2011-11-231-2/+7
| | | | | | and positive: positive, because it could be directly computed to be positive; negative, because the nsw flag means it is either negative or undefined (the multiplication always overflowed). llvm-svn: 145104
* X86: Use btq for bit tests if the immediate can't be encoded in 32 bits.Benjamin Kramer2011-11-231-1/+9
| | | | | | | | | | | | | | | | Before: movabsq $4294967296, %rax ## encoding: [0x48,0xb8,0x00,0x00,0x00,0x00,0x01,0x00,0x00,0x00] testq %rax, %rdi ## encoding: [0x48,0x85,0xf8] jne LBB0_2 ## encoding: [0x75,A] After: btq $32, %rdi ## encoding: [0x48,0x0f,0xba,0xe7,0x20] jb LBB0_2 ## encoding: [0x72,A] btq is usually slower than testq because it doesn't fuse with the jump, but here we're better off saving one register and a giant movabsq. llvm-svn: 145103
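The condition in the commit above can be modelled in a few lines. This is a hedged Python sketch, not the X86 backend's selection code: all function names are hypothetical, and it assumes that a 64-bit `testq` immediate is a sign-extended 32-bit value, so a single-bit mask at bit 31 or above cannot be encoded directly.

```python
def single_bit_index(mask):
    """Return the index of the only set bit in mask, or None otherwise."""
    if mask > 0 and mask & (mask - 1) == 0:
        return mask.bit_length() - 1
    return None


def fits_sign_extended_imm32(mask):
    """Check whether a 64-bit mask is encodable as a sign-extended imm32."""
    signed = mask - (1 << 64) if mask >= (1 << 63) else mask
    return -(1 << 31) <= signed < (1 << 31)


def prefer_bit_test(mask):
    """Return the bit index to test when the mask would need a movabsq."""
    if fits_sign_extended_imm32(mask):
        return None  # a plain test with an immediate is enough
    return single_bit_index(mask)


def and_is_nonzero(value, mask):
    """Model of the test/jne form: branch taken when value & mask != 0."""
    return (value & mask) != 0


def bit_test_carry(value, index):
    """Model of the bt/jb form: branch taken when the selected bit is set."""
    return (value >> index) & 1 == 1
```

With the mask from the example, `prefer_bit_test(4294967296)` returns 32, and `and_is_nonzero(v, 1 << 32)` agrees with `bit_test_carry(v, 32)` for any non-negative `v`, which is why the two instruction sequences branch the same way.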
* Relax an invariant that block placement was trying to assert a bitChandler Carruth2011-11-231-3/+1
| | | | | | | | | further. This invariant just wasn't going to work in the face of unanalyzable branches; we need to be resilient to the phenomenon of chains poking into a loop and poking out of a loop. In fact, we already were, we just needed to not assert on it. This was found during a bootstrap with block placement turned on. llvm-svn: 145100
* I added several lines to the X86 code generator that allow choosing Elena Demikhovsky2011-11-231-15/+46
| | | | | | VSHUFPS/VSHUFPD instructions while lowering a VECTOR_SHUFFLE node. I check a commuted VSHUFP mask. The patch was reviewed by Bruno. llvm-svn: 145099
* Handle the case of a no-return invoke correctly. It actually still hasChandler Carruth2011-11-231-0/+8
| | | | | | | | | successors, they just are all landing pad successors. We handle this the same way as no successors. Comments attached for the next person to wade through here and another lovely test case courtesy of Benjamin Kramer's bugpoint reduction. llvm-svn: 145098
* Enable stack protectors for all arrays, not just char arrays. rdar://5875909Bob Wilson2011-11-231-6/+1
| | | | | | Patch by Bill Wendling. llvm-svn: 145097
* Fix PR11422.Jakob Stoklund Olesen2011-11-232-3/+10
| | | | | | | | | | | | | This was a bug in keeping track of the available domains when merging domain values. The wrong domain mask caused ExecutionDepsFix to try to move VANDPSYrr to the integer domain which is only available in AVX2. Also add an assertion to catch future attempts at emitting AVX2 instructions. llvm-svn: 145096
* Fix a crash in block placement due to an inner loop that happened to beChandler Carruth2011-11-231-1/+4
| | | | | | | | | | | reversed in the function's original ordering, and we happened to encounter it while handling an outer unnatural CFG structure. Thanks to the test case reduced from GCC's source by Benjamin Kramer. This may also fix a crasher in gzip that Duncan reduced for me, but I haven't yet gotten to testing that one. llvm-svn: 145094