path: root/llvm/lib
Commit message  Author  Age  Files  Lines
* blockfreq: Only one mass distribution per nodeDuncan P. N. Exon Smith2014-04-251-61/+11
| | | | | | | | | | Remove the concepts of "forward" and "general" mass distributions, which were wrong. The split might have made sense in an early version of the algorithm, but it's definitely wrong now. <rdar://problem/14292693> llvm-svn: 207195
* blockfreq: Document assertionDuncan P. N. Exon Smith2014-04-251-1/+1
| | | | | | <rdar://problem/14292693> llvm-svn: 207194
* blockfreq: Document high-level functionsDuncan P. N. Exon Smith2014-04-251-1/+1
| | | | | | <rdar://problem/14292693> llvm-svn: 207191
* blockfreq: Scale LoopData::Scale on the way downDuncan P. N. Exon Smith2014-04-251-23/+12
| | | | | | | | | | | Rather than scaling loop headers and then scaling all the loop members by the header frequency, scale `LoopData::Scale` itself, and scale the loop members by it. It's much more obvious what's going on this way, and doesn't cost any extra multiplies. <rdar://problem/14292693> llvm-svn: 207189
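The "scale on the way down" idea can be illustrated with a much-simplified sketch. Everything here is an assumption made for illustration: `LoopSketch`, `unwrapLoopSketch`, and the flat frequency table are hypothetical stand-ins, not the actual `LoopData` or `BlockFrequencyInfoImplBase` code. Each loop's own scale is first multiplied by the scale of the enclosing loop, and the members are then multiplied by that single value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for the real LoopData; not LLVM's type.
struct LoopSketch {
  double Scale;                      // loop-local scale factor
  std::vector<std::size_t> Members;  // indices into the frequency table
  std::vector<LoopSketch> SubLoops;  // directly nested loops
};

// Walks the loop tree from the outside in. The loop's own Scale is updated
// first, then every member frequency is multiplied by it exactly once.
void unwrapLoopSketch(LoopSketch &Loop, double OuterScale,
                      std::vector<double> &Freqs) {
  Loop.Scale *= OuterScale;  // scale the Scale itself on the way down
  for (std::size_t Member : Loop.Members) {
    assert(Member < Freqs.size() && "member index out of range");
    Freqs[Member] *= Loop.Scale;
  }
  for (LoopSketch &Sub : Loop.SubLoops)
    unwrapLoopSketch(Sub, Loop.Scale, Freqs);
}
```

Calling `unwrapLoopSketch(Outer, 1.0, Freqs)` on the outermost loop leaves each nested loop's `Scale` holding the product of all enclosing scales, so no member needs a second multiply.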
* blockfreq: unwrapLoopPackage() => unwrapLoop()Duncan P. N. Exon Smith2014-04-251-10/+8
| | | | | | <rdar://problem/14292693> llvm-svn: 207188
* blockfreq: Pass the Loop directly into unwrapLoopPackage()Duncan P. N. Exon Smith2014-04-251-6/+4
| | | | | | <rdar://problem/14292693> llvm-svn: 207187
* blockfreq: Unwrap from LoopsDuncan P. N. Exon Smith2014-04-251-4/+2
| | | | | | | | When unwrapping loops, just visit the loops rather than all nodes. <rdar://problem/14292693> llvm-svn: 207186
* blockfreq: Separate unwrapLoops() from finalizeMetrics()Duncan P. N. Exon Smith2014-04-251-5/+9
| | | | | | <rdar://problem/14292693> llvm-svn: 207185
* blockfreq: Expose getPackagedNode()Duncan P. N. Exon Smith2014-04-251-24/+1
| | | | | | | | | Make `getPackagedNode()` a member function of `BlockFrequencyInfoImplBase` so that it's available for templated code. <rdar://problem/14292693> llvm-svn: 207183
* blockfreq: Store the header with the membersDuncan P. N. Exon Smith2014-04-251-2/+2
| | | | | | <rdar://problem/14292693> llvm-svn: 207182
* blockfreq: Encapsulate LoopData::HeaderDuncan P. N. Exon Smith2014-04-251-14/+12
| | | | | | <rdar://problem/14292693> llvm-svn: 207181
* blockfreq: Use LoopData directlyDuncan P. N. Exon Smith2014-04-251-30/+28
| | | | | | | | Instead of passing around loop headers, pass around `LoopData` directly. <rdar://problem/14292693> llvm-svn: 207179
* blockfreq: Use a std::list for LoopsDuncan P. N. Exon Smith2014-04-251-1/+1
| | | | | | | | | | | | As pointed out by David Blaikie in code review, a `std::list<T>` is simpler than a `std::vector<std::unique_ptr<T>>`. Another option is a `std::deque<T>` (which allocates in chunks), but I'd like to leave open the option of inserting in the middle of the sequence for handling irreducible control flow on the fly. <rdar://problem/14292693> llvm-svn: 207177
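The point about inserting in the middle of the sequence can be shown with a small sketch. The `LoopData` struct below is a hypothetical one-field stand-in and `insertLoopAt` is an illustrative helper, neither taken from the patch. The sketch relies on the standard guarantee that `std::list::insert` does not invalidate pointers or references to existing elements.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>

// Hypothetical stand-in; the real LoopData carries much more state.
struct LoopData {
  int Header;
};

// Inserts a new loop before position Index and returns a pointer to it.
// Pointers to the other elements stay valid, which is what makes
// std::list attractive when entries may be added mid-sequence.
LoopData *insertLoopAt(std::list<LoopData> &Loops, std::size_t Index,
                       int Header) {
  assert(Index <= Loops.size() && "insert position out of range");
  auto Pos = std::next(Loops.begin(), static_cast<std::ptrdiff_t>(Index));
  return &*Loops.insert(Pos, LoopData{Header});
}
```

With a `std::vector<LoopData>`, the same insertion could reallocate and move every element, so code holding `LoopData *` would need the extra `std::unique_ptr` indirection mentioned above.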
* Allow vectorization of bit intrinsics in BB Vectorizer.Karthik Bhat2014-04-251-8/+21
| | | | | | This patch adds support for vectorization of bit intrinsics such as bswap, ctpop, ctlz, and cttz. llvm-svn: 207174
* ProfileData: Treat missing function counts as malformedJustin Bogner2014-04-251-2/+6
| | | | llvm-svn: 207172
* Fix quadratic performance during debug compression due to sections x symbols iteration.David Blaikie2014-04-251-12/+21
| | | | When fixing the symbols in each compressed section we were iterating over all symbols for each compressed section. In extreme cases this could snowball severely (5min uncompressed -> 35min compressed), because large numbers of compressed sections can be generated by DWARF type units. To address this, build a map of the symbols in each section ahead of time, and access that map if a section is being compressed. This brings compile time for the aforementioned example down to ~6 minutes. llvm-svn: 207167
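The fix described above follows a common pattern: index once, then look up per section. A minimal sketch of that pattern is shown here. The `Symbol` struct, the integer section ids, and both function names are assumptions for illustration, not the MC layer's actual types.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a symbol that belongs to one section.
struct Symbol {
  std::string Name;
  int SectionId;
};

using SymbolsBySection = std::map<int, std::vector<const Symbol *>>;

// One pass over all symbols, done ahead of time.
SymbolsBySection indexSymbolsBySection(const std::vector<Symbol> &Symbols) {
  SymbolsBySection Index;
  for (const Symbol &S : Symbols) {
    assert(S.SectionId >= 0 && "section ids are non-negative in this sketch");
    Index[S.SectionId].push_back(&S);
  }
  return Index;
}

// Visits only the symbols of each compressed section instead of rescanning
// the whole symbol list per section. Here it just counts them.
std::size_t countSymbolsToFix(const SymbolsBySection &Index,
                              const std::vector<int> &CompressedSections) {
  std::size_t Count = 0;
  for (int Id : CompressedSections) {
    auto It = Index.find(Id);
    if (It != Index.end())
      Count += It->second.size();
  }
  return Count;
}
```

The total work becomes roughly one pass over the symbols plus one lookup per compressed section, rather than sections times symbols.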
* Revert "This reapplies r207130 with an additional testcase+and a missing check for"Adrian Prantl2014-04-258-73/+45
| | | | Typo in testcase. llvm-svn: 207166
* This reapplies r207130 with an additional testcase+and a missing check forAdrian Prantl2014-04-258-45/+73
| AllocaInst that was missing in one location.
| Debug info for optimized code: Support variables that are on the stack and described by DBG_VALUEs during their lifetime. Previously, when a variable was at a FrameIndex for any part of its lifetime, this would shadow all other DBG_VALUEs and only a single fbreg location would be emitted, which in fact is only valid for a small range and not the entire lexical scope of the variable. The included dbg-value-const-byref testcase demonstrates this.
| This patch fixes this by
| Local
| - emitting dbg.value intrinsics for allocas that are passed by reference
| - dropping all dbg.declares (they are now fully lowered to dbg.values)
| SelectionDAG
| - renamed constructors for SDDbgValue for better readability.
| - fix UserValue::match() to handle indirect values correctly
| - not inserting MMI table entries for dbg.values that describe allocas.
| - lowering dbg.values that describe allocas into *indirect* DBG_VALUEs.
| CodeGenPrepare
| - leaving dbg.values for an alloca where they are (see comment)
| Other
| - regenerated/updated instcombine.ll testcase and included source
| rdar://problem/16679879 http://reviews.llvm.org/D3374 llvm-svn: 207165
* Revert "Debug info for optimized code: Support variables that are on the stack and"Adrian Prantl2014-04-257-66/+37
| | | | This reverts commit 207130 for buildbot breakage. llvm-svn: 207162
* Revert "Debug info: Let dbg.values inserted by LowerDbgDeclare inherit the location"Adrian Prantl2014-04-241-3/+17
| | | | This reverts commit 207130 for buildbot breakage. llvm-svn: 207159
* [DWARF parser] Make a few methods non-publicAlexey Samsonov2014-04-243-10/+12
| | | | llvm-svn: 207156
* [DWARF parser] DWARFUnit ctor doesn't need both parsed and raw .debug_abbrev section.Alexey Samsonov2014-04-245-36/+34
| | | | Remove the former. llvm-svn: 207153
* [DWARF parser] Simplify and re-format a methodAlexey Samsonov2014-04-242-11/+13
| | | | llvm-svn: 207151
* [LCG] Switch a weird do/while loop that actually couldn't fail itsChandler Carruth2014-04-241-5/+4
| | | | | | | condition into an obviously infinite loop with an assert about the degenerate condition. No functionality changed. llvm-svn: 207147
* X86: Don't transform shifts into ands when the sign bit is tested.Benjamin Kramer2014-04-241-1/+2
| | | | | | Should unbreak MultiSource/Benchmarks/mediabench/g721/g721encode/encode. llvm-svn: 207145
* Add 'musttail' marker to call instructionsReid Kleckner2014-04-2414-12/+145
| | | | | | | | | | This is similar to the 'tail' marker, except that it guarantees that tail call optimization will occur. It also comes with conservative IR verification rules that ensure that tail call optimization is possible. Reviewers: nicholas Differential Revision: http://llvm-reviews.chandlerc.com/D3240 llvm-svn: 207143
* Remove C++11ism (specializing a template in a surrounding namespace) to appease the buildbots.Richard Smith2014-04-241-3/+5
| | | | llvm-svn: 207136
* Debug info: Let dbg.values inserted by LowerDbgDeclare inherit the locationAdrian Prantl2014-04-241-17/+3
| | | | | | | | | of the dbg.value. This gets rid of tons of redundant variable DIEs in subscopes. rdar://problem/14874886, rdar://problem/16679936 llvm-svn: 207135
* [modules] "Specialize" a function by actually specializing a function templateRichard Smith2014-04-241-8/+9
| | | | | | | rather than by adding an overload and hoping that it's declared before the code that calls it. (In a modules build, it isn't.) llvm-svn: 207133
* Debug info for optimized code: Support variables that are on the stack andAdrian Prantl2014-04-247-37/+66
| described by DBG_VALUEs during their lifetime.
| Previously, when a variable was at a FrameIndex for any part of its lifetime, this would shadow all other DBG_VALUEs and only a single fbreg location would be emitted, which in fact is only valid for a small range and not the entire lexical scope of the variable. The included dbg-value-const-byref testcase demonstrates this.
| This patch fixes this by
| Local
| - emitting dbg.value intrinsics for allocas that are passed by reference
| - dropping all dbg.declares (they are now fully lowered to dbg.values)
| SelectionDAG
| - renamed constructors for SDDbgValue for better readability.
| - fix UserValue::match() to handle indirect values correctly
| - not inserting MMI table entries for dbg.values that describe allocas.
| - lowering dbg.values that describe allocas into *indirect* DBG_VALUEs.
| CodeGenPrepare
| - leaving dbg.values for an alloca where they are (see comment)
| Other
| - regenerated/updated instcombine-intrinsics testcase and included source
| rdar://problem/16679879 http://reviews.llvm.org/D3374 llvm-svn: 207130
* [X86] Add support for Read Time Stamp Counter x86 builtin intrinsics.Andrea Di Biagio2014-04-244-33/+104
| This patch:
| - Adds two new X86 builtin intrinsics ('int_x86_rdtsc' and 'int_x86_rdtscp') as GCCBuiltin intrinsics;
| - Teaches the backend how to lower the two new builtins;
| - Introduces a common function to lower READCYCLECOUNTER dag nodes and the two new rdtsc/rdtscp intrinsics;
| - Improves (and extends) the existing x86 test 'rdtsc.ll'; now test 'rdtsc.ll' correctly verifies that both READCYCLECOUNTER and the two new intrinsics work fine for both 64bit and 32bit Subtargets.
| llvm-svn: 207127
* R600/SI: Use address space in allowsUnalignedMemoryAccessesMatt Arsenault2014-04-241-0/+30
| | | | llvm-svn: 207126
* Spread some const around for non-mutating uses of MCSymbolData.David Blaikie2014-04-248-25/+26
| | | | | | | | I discovered this const-hole while attempting to coalesce the Symbol and SymbolMap data structures. There are some pending issues with that, but I figured this change was easy to flush early. llvm-svn: 207124
* [mips] Remove non-ascii character.Matheus Almeida2014-04-241-1/+1
| | | | llvm-svn: 207123
* Fix memory leak of MCSymbolData in MCAsmStreamer.David Blaikie2014-04-241-8/+10
| | | | | | | | | Leak identified by LSan and reported by Kostya Serebryany. Let's get a bit experimental here... in theory our minimum compiler versions support unordered_map. llvm-svn: 207118
* AArch64: print NEON lists with a space.Tim Northover2014-04-241-2/+2
| | | | | | | This matches ARM64 behaviour, which I think is clearer. It also puts all the churn from that difference into one easily ignored commit. llvm-svn: 207116
* [asan] Use MCInstrInfo in inline asm instrumentation.Evgeniy Stepanov2014-04-243-26/+17
| | | | | | Patch by Yuri Gorshenin. llvm-svn: 207115
* AArch64/ARM64: allow negative addends, at least on ELF.Tim Northover2014-04-241-14/+18
| | | | llvm-svn: 207111
* ARM64: support relocated "TBZ/TBNZ" instructions.Tim Northover2014-04-241-0/+2
| | | | llvm-svn: 207110
* AArch64/ARM64: support relocated ADR instructionTim Northover2014-04-241-1/+2
| | | | llvm-svn: 207109
* AArch64/ARM64: add support for :abs_gN_s: MOVZ modifiersTim Northover2014-04-246-0/+33
| | | | | | We only need assembly support, so it's fairly easy. llvm-svn: 207108
* ARM64: shut up warning about variable only used in assert.Tim Northover2014-04-241-0/+1
| | | | llvm-svn: 207106
* AArch64/ARM64: disentangle the "B.CC" and "LDR lit" operandsTim Northover2014-04-2410-37/+77
| These can have different relocations in ELF. In particular both:
|     b.eq global
|     ldr x0, global
| are valid, giving different relocations. The only possible way to distinguish them is via a different fixup, so the operands had to be separated throughout the backend.
| llvm-svn: 207105
* AArch64/ARM64: implement BFI optimisationTim Northover2014-04-241-45/+122
| | | | | | | | | | | ARM64 was not producing pure BFI instructions for bitfield insertion operations, unlike AArch64. The approach had to be a little different (in ISelDAGToDAG rather than ISelLowering), and the outcomes aren't identical but hopefully this gives it similar power. This should address PR19424. llvm-svn: 207102
* [LCG] Incorporate the core trick of improvements on the naive Tarjan'sChandler Carruth2014-04-241-41/+61
| algorithm here: http://dl.acm.org/citation.cfm?id=177301.
| The idea of isolating the roots has even more relevance when using the stack not just to implement the DFS but also to implement the recursive step. Because we use it for the recursive step, to isolate the roots we need to maintain two stacks: one for our recursive DFS walk, and another of the nodes that have been walked. The nice thing is that the latter will be half the size. It also fixes a complete hack where we scanned backwards over the stack to find the next potential-root to continue processing. Now that is always the top of the DFS stack.
| While this is a really nice improvement already (IMO) it further opens the door for two important simplifications:
| 1) De-duplicating some of the code across the two different walks. I've actually made the duplication a bit worse in some senses with this patch because the two are starting to converge.
| 2) Dramatically simplifying the loop structures of both walks.
| I wanted to do those separately as they'll be essentially *just* CFG restructuring. This patch on the other hand actually uses different data structures to implement the algorithm itself.
| llvm-svn: 207098
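The two-stack arrangement can be sketched on a plain adjacency-list graph. This is an illustrative iterative Tarjan-style walk under stated assumptions, not the LazyCallGraph code: `Graph`, `findSCCs`, and all variable names are hypothetical. One stack drives the DFS, and a second holds nodes that have been fully walked but whose SCC root has not finished yet; roots themselves never go onto the second stack.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical graph type: Adj[N] lists the successors of node N.
using Graph = std::vector<std::vector<int>>;

std::vector<std::vector<int>> findSCCs(const Graph &Adj) {
  const int N = static_cast<int>(Adj.size());
  std::vector<int> DFSNumber(N, 0), LowLink(N, 0);  // 0 means unvisited
  std::vector<bool> InSCC(N, false);                // already assigned
  std::vector<std::pair<int, std::size_t>> DFSStack;  // (node, next edge)
  std::vector<int> PendingStack;  // walked nodes awaiting their root
  std::vector<std::vector<int>> SCCs;
  int NextNumber = 1;

  for (int Start = 0; Start < N; ++Start) {
    if (DFSNumber[Start] != 0)
      continue;
    DFSNumber[Start] = LowLink[Start] = NextNumber++;
    DFSStack.emplace_back(Start, std::size_t{0});
    while (!DFSStack.empty()) {
      const int Node = DFSStack.back().first;
      const std::size_t EdgeIdx = DFSStack.back().second;
      if (EdgeIdx < Adj[Node].size()) {
        ++DFSStack.back().second;  // advance before any push
        const int Child = Adj[Node][EdgeIdx];
        assert(Child >= 0 && Child < N && "edge target out of range");
        if (DFSNumber[Child] == 0) {
          DFSNumber[Child] = LowLink[Child] = NextNumber++;
          DFSStack.emplace_back(Child, std::size_t{0});
        } else if (!InSCC[Child]) {
          LowLink[Node] = std::min(LowLink[Node], DFSNumber[Child]);
        }
        continue;
      }
      // All edges of Node are walked; the next candidate root is now
      // simply the new top of the DFS stack.
      DFSStack.pop_back();
      if (LowLink[Node] == DFSNumber[Node]) {
        // Node is a root: collect the pending nodes discovered after it.
        std::vector<int> SCC{Node};
        InSCC[Node] = true;
        while (!PendingStack.empty() &&
               DFSNumber[PendingStack.back()] > DFSNumber[Node]) {
          InSCC[PendingStack.back()] = true;
          SCC.push_back(PendingStack.back());
          PendingStack.pop_back();
        }
        SCCs.push_back(std::move(SCC));
      } else {
        PendingStack.push_back(Node);
      }
      if (!DFSStack.empty()) {
        const int Parent = DFSStack.back().first;
        LowLink[Parent] = std::min(LowLink[Parent], LowLink[Node]);
      }
    }
  }
  return SCCs;
}
```

SCCs come out in postorder, so a component appears before any component that has an edge into it. The order of nodes inside each component is unspecified in this sketch.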
* [LCG] Rotate logic applied to the top of the DFSStack to instead beChandler Carruth2014-04-241-25/+24
| | | | | | | | | | | | | applied prior to pushing a node onto the DFSStack. This is the first step toward avoiding the stack entirely for leaf nodes. It also simplifies things a bit and I think is pointing the way toward factoring some more of the shared logic out of the two implementations. It is also making it more obvious how to restructure the loops themselves to be a bit easier to read (although no different in terms of functionality). llvm-svn: 207095
* [asan] Fix instrumentation of x86 intel syntax inline assembly.Evgeniy Stepanov2014-04-241-15/+15
| | | | | | Patch by Yuri Gorshenin. llvm-svn: 207092
* [LCG] Switch the parent SCC tracking from a SmallSetVector toChandler Carruth2014-04-241-2/+2
| | | | | | | | | | | | | | a SmallPtrSet. Currently, there is no need for stable iteration in this dimension, and I now think there won't need to be going forward. If this is ever re-introduced in any form, it needs to not be a SetVector based solution because removal cannot be linear. There will be many SCCs with large numbers of parents. When encountering these, the incremental SCC update for intra-SCC edge removal was quadratic due to linear removal (kind of). I'm really hoping we can avoid having an ordering property here at all though... llvm-svn: 207091
* [LCG] We don't actually need a set in each SCC to track the nodes. WeChandler Carruth2014-04-241-7/+1
| | | | | | | can use the node -> SCC mapping in the top-level graph to test this on the rare occasions we need it. llvm-svn: 207090
* X86: Emit test instead of constant shift + compare if the shift result is unused.Benjamin Kramer2014-04-241-21/+43
| This allows us to compile
|     return (mask & 0x8 ? a : b);
| into
|     testb $8, %dil
|     cmovnel %edx, %esi
| instead of
|     andl $8, %edi
|     shrl $3, %edi
|     cmovnel %edx, %esi
| which we formed previously because dag combiner canonicalizes setcc of and into shift.
| llvm-svn: 207088
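At the source level, the two instruction sequences correspond to two equivalent ways of testing the same bit. The sketch below only illustrates that equivalence; the function names are hypothetical, and which instructions a compiler actually emits for either form depends on the compiler version and flags.

```cpp
#include <cassert>

// Direct bit test, as in the commit message's example.
constexpr int selectByMask(unsigned char Mask, int A, int B) {
  return (Mask & 0x8) ? A : B;
}

// Shift-then-compare form: move the tested bit down to bit 0 first.
// Only the zero/non-zero outcome is used, so the shift result itself
// is never needed.
constexpr int selectByShift(unsigned char Mask, int A, int B) {
  return (((Mask & 0x8) >> 3) != 0) ? A : B;
}

// Exhaustively checks that both forms agree for every 8-bit mask.
void checkFormsAgree() {
  for (int Mask = 0; Mask < 256; ++Mask) {
    const unsigned char M = static_cast<unsigned char>(Mask);
    assert(selectByMask(M, 1, 2) == selectByShift(M, 1, 2));
  }
}
```

Because the shifted value is used only for the comparison, a single test of bit 3 carries the same information, which is the observation the patch relies on.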