summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* Confusing comment typo.Andrew Trick2013-08-071-1/+1
| | | | llvm-svn: 187895
* Remove some parens. No functional change.Eric Christopher2013-08-071-4/+4
| | | | llvm-svn: 187872
* Add a way to grab a particular attribute out of a DIE.Eric Christopher2013-08-073-14/+22
| | | | | | | Use it when we're looking for a string in particular. Update comments as well. llvm-svn: 187844
* Move somewhat messy conditional out of line.Eric Christopher2013-08-071-7/+11
| | | | | | No functional change. llvm-svn: 187843
* LoopVectorize: Allow vectorization of loops with lifetime markersArnold Schwaighofer2013-08-061-0/+3
| | | | | | Patch by Marc Jessome! llvm-svn: 187825
* Refactor isInTailCallPosition handlingTim Northover2013-08-061-134/+287
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This change came about primarily because of two issues in the existing code. Niether of: define i64 @test1(i64 %val) { %in = trunc i64 %val to i32 tail call i32 @ret32(i32 returned %in) ret i64 %val } define i64 @test2(i64 %val) { tail call i32 @ret32(i32 returned undef) ret i32 42 } should be tail calls, and the function sameNoopInput is responsible. The main problem is that it is completely symmetric in the "tail call" and "ret" value, but in reality different things are allowed on each side. For these cases: 1. Any truncation should lead to a larger value being generated by "tail call" than needed by "ret". 2. Undef should only be allowed as a source for ret, not as a result of the call. Along the way I noticed that a mismatch between what this function treats as a valid truncation and what the backends see can lead to invalid calls as well (see x86-32 test case). This patch refactors the code so that instead of being based primarily on values which it recurses into when necessary, it starts by inspecting the type and considers each fundamental slot that the backend will see in turn. For example, given a pathological function that returned {{}, {{}, i32, {}}, i32} we would consider each "real" i32 in turn, and ask if it passes through unchanged. This is much closer to what the backend sees as a result of ComputeValueVTs. Aside from the bug fixes, this eliminates the recursion that's going on and, I believe, makes the bulk of the code significantly easier to understand. The trade-off is the nasty iterators needed to find the real types inside a returned value. llvm-svn: 187787
* AsmPrinter/CMakeLists.txt: Add explicit dependency to intrinsics_gen here.NAKAMURA Takumi2013-08-061-0/+2
| | | | llvm-svn: 187778
* Recommit previous cleanup with a fix for c++98 ambiguity.Eric Christopher2013-08-051-5/+2
| | | | llvm-svn: 187752
* TargetLowering: Add getVectorIdxTy() function v2Tom Stellard2013-08-058-77/+98
| | | | | | | | | | | | | | | | | | | | | This virtual function can be implemented by targets to specify the type to use for the index operand of INSERT_VECTOR_ELT, EXTRACT_VECTOR_ELT, INSERT_SUBVECTOR, EXTRACT_SUBVECTOR. The default implementation returns the result from TargetLowering::getPointerTy() The previous code was using TargetLowering::getPointerTy() for vector indices, because this is guaranteed to be legal on all targets. However, using TargetLowering::getPointerTy() can be a problem for targets with pointer sizes that differ across address spaces. On such targets, when vectors need to be loaded or stored to an address space other than the default 'zero' address space (which is the address space assumed by TargetLowering::getPointerTy()), having an index that is a different size than the pointer can lead to inefficient pointer calculations, (e.g. 64-bit adds for a 32-bit address space). There is no intended functionality change with this patch. llvm-svn: 187748
* Revert "Use existing builtin hashing functions to make this routine more"Eric Christopher2013-08-051-2/+5
| | | | | | This reverts commit r187745. llvm-svn: 187747
* Use existing builtin hashing functions to make this routine moreEric Christopher2013-08-051-5/+2
| | | | | | simple. llvm-svn: 187745
* Change parent hashing algorithm to be non-recursive and elaborateEric Christopher2013-08-051-20/+32
| | | | | | greatly on many comments in the code. llvm-svn: 187742
* Don't leak passes if added outside of the area determined by Started/Stopped ↵Benjamin Kramer2013-08-051-0/+2
| | | | | | flags. llvm-svn: 187722
* Bugfix for making the DWARF debug strings and labels to code emitted as ↵Carlo Kok2013-08-021-0/+4
| | | | | | secrel32 instead of long opcodes (only for coff). This makes them debuggable with GDB (with fix for 64bits msvc) llvm-svn: 187656
* Revert r187597, "Bugfix for making the DWARF debug strings and labels to ↵NAKAMURA Takumi2013-08-021-4/+0
| | | | | | | | code emitted as secrel32 instead of long opcodes (only for coff). This makes them debuggable with GDB." It broke x86_64-win32 builder in llvm/test/DebugInfo. llvm-svn: 187642
* Use function attributes to indicate that we don't want to realign the stack.Bill Wendling2013-08-011-1/+2
| | | | | | | | Function attributes are the future! So just query whether we want to realign the stack directly from the function instead of through a random target options structure. llvm-svn: 187618
* DebugInfo: Emit definitions for types with no members.David Blaikie2013-08-011-7/+3
| | | | | | The absence of members was a poor/incorrect proxy for "is definition". llvm-svn: 187607
* Bugfix for making the DWARF debug strings and labels to code emitted as ↵Carlo Kok2013-08-011-0/+4
| | | | | | | | secrel32 instead of long opcodes (only for coff). This makes them debuggable with GDB. fixes Bug 16249 - LLVM generates broken debug info on Windows llvm-svn: 187597
* Fix crashing on invalid inline asm with matching constraints.Eric Christopher2013-07-311-15/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | For a testcase like the following: typedef unsigned long uint64_t; typedef struct { uint64_t lo; uint64_t hi; } blob128_t; void add_128_to_128(const blob128_t *in, blob128_t *res) { asm ("PAND %1, %0" : "+Q"(*res) : "Q"(*in)); } where we'll fail to allocate the register for the output constraint, our matching input constraint will not find a register to match, and could try to search past the end of the current operands array. On the idea that we'd like to attempt to keep compilation going to find more errors in the module, change the error cases when we're visiting inline asm IR to return immediately and avoid trying to create a node in the DAG. This leaves us with only a single error message per inline asm instruction, but allows us to safely keep going in the general case. llvm-svn: 187470
* Reflow this to be easier to read.Eric Christopher2013-07-301-7/+5
| | | | llvm-svn: 187459
* Down-scale slot index distance to save bits.Andrew Trick2013-07-301-1/+1
| | | | llvm-svn: 187438
* MI Sched: Track live-thru registers.Andrew Trick2013-07-302-12/+67
| | | | | | | | | | | When registers must be live throughout the scheduling region, increase the limit for the register class. Once we exceed the original limit, they will be spilled, and there's no point further reducing pressure. This isn't a perfect heuristics but avoids a situation where the scheduler could become trapped by trying to achieve the impossible. llvm-svn: 187436
* MI Sched fix: assert "Disconnected LRG within the scheduling region."Andrew Trick2013-07-301-0/+6
| | | | llvm-svn: 187435
* [DAGCombiner] insert_vector_elt: Avoid building a vector twice.Quentin Colombet2013-07-301-1/+3
| | | | | | | | | | | | | | | | This patch prevents the following combine when the input vector is used more than once. insert_vector_elt (build_vector elt0, ..., eltN), NewEltIdx, idx => build_vector elt0, ..., NewEltIdx, ..., eltN The reasons are: - Building a vector may be expensive, so try to reuse the existing part of a vector instead of creating a new one (think big vectors). - elt0 to eltN now have two users instead of one. This may prevent some other optimizations. llvm-svn: 187396
* Fix a truly egregious thinko in anonymous namespace check,Eric Christopher2013-07-291-2/+2
| | | | | | | | update testcase to make sure we generate debug info for walrus by adding a non-trivial constructor and verify that we don't emit an ODR signature for the type. llvm-svn: 187393
* Make sure we don't emit an ODR hash for types with no name and makeEric Christopher2013-07-291-2/+4
| | | | | | sure the comments for each testcase are a bit easier to distinguish. llvm-svn: 187392
* Elaborate a bit on the type unit and ODR conditional code.Eric Christopher2013-07-291-4/+4
| | | | llvm-svn: 187385
* Use proper section suffix for COFF weak symbolsNico Rieck2013-07-291-4/+2
| | | | | | | | | 32-bit symbols have "_" as global prefix, but when forming the name of COMDAT sections this prefix is ignored. The current behavior assumes that this prefix is always present which is not the case for 64-bit and names are truncated. llvm-svn: 187356
* DwarfDebug: MD5 is always little endian, bswap on big endian platforms.Benjamin Kramer2013-07-271-2/+3
| | | | | | This makes LLVM emit the same signature regardless of host and target endianess. llvm-svn: 187304
* Fix a memory leak in the debug emission by simply not allocating memory.Chandler Carruth2013-07-271-2/+2
| | | | | | | | There doesn't appear to be any reason to put this variable on the heap. I'm suspicious of the LexicalScope above that we stuff in a map and then delete afterward, but I'm just trying to get the valgrind bot clean. llvm-svn: 187301
* Reimplement isPotentiallyReachable to make nocapture deduction much stronger.Nick Lewycky2013-07-271-0/+1
| | | | | | | | | | Adds unit tests for it too. Split BasicBlockUtils into an analysis-half and a transforms-half, and put the analysis bits into a new Analysis/CFG.{h,cpp}. Promote isPotentiallyReachable into llvm::isPotentiallyReachable and move it into Analysis/CFG. llvm-svn: 187283
* SimplifyCFG: Use parallel-and and parallel-or mode to consolidate branch ↵Tom Stellard2013-07-271-0/+3
| | | | | | | | | | | | | | conditions Merge consecutive if-regions if they contain identical statements. Both transformations reduce number of branches. The transformation is guarded by a target-hook, and is currently enabled only for +R600, but the correctness has been tested on X86 target using a variety of CPU benchmarks. Patch by: Mei Ye llvm-svn: 187278
* Remove addLetterToHash, no functional change.Eric Christopher2013-07-261-10/+1
| | | | llvm-svn: 187245
* Add preliminary support for hashing DIEs and breaking them intoEric Christopher2013-07-265-4/+254
| | | | | | | | | | | | | | | | type units. Initially this support is used in the computation of an ODR checker for C++. For now we're attaching it to the DIE, but in the future it will be attached to the type unit. This also starts breaking out types into the separation for type units, but without actually splitting the DIEs. In preparation for hashing the DIEs this adds a DIEString type that contains a StringRef with the string contained at the label. llvm-svn: 187213
* Add a target legalize hook for SplitVectorOperand (again)Justin Holewinski2013-07-261-0/+4
| | | | | | | | | | | | | | CustomLowerNode was not being called during SplitVectorOperand, meaning custom legalization could not be used by targets. This also adds a test case for NVPTX that depends on this custom legalization. Differential Revision: http://llvm-reviews.chandlerc.com/D1195 Attempt to fix the buildbots by making the X86 test I just added platform independent llvm-svn: 187202
* Revert "Add a target legalize hook for SplitVectorOperand"Rafael Espindola2013-07-261-4/+0
| | | | | | | | | | This reverts commit 187198. It broke the bots. The soft float test probably needs a -triple because of name differences. On the hard float test I am getting a "roundss $1, %xmm0, %xmm0", instead of "vroundss $1, %xmm0, %xmm0, %xmm0". llvm-svn: 187201
* Add a target legalize hook for SplitVectorOperandJustin Holewinski2013-07-261-0/+4
| | | | | | | | | | | | CustomLowerNode was not being called during SplitVectorOperand, meaning custom legalization could not be used by targets. This also adds a test case for NVPTX that depends on this custom legalization. Differential Revision: http://llvm-reviews.chandlerc.com/D1195 llvm-svn: 187198
* RegAllocGreedy comment.Andrew Trick2013-07-251-1/+2
| | | | llvm-svn: 187141
* Evict local live ranges if they can be reassigned.Andrew Trick2013-07-251-1/+29
| | | | | | | | | | | | | | | | The previous change to local live range allocation also suppressed eviction of local ranges. In rare cases, this could result in more expensive register choices. This commit actually revives a feature that I added long ago: check if live ranges can be reassigned before eviction. But now it only happens in rare cases of evicting a local live range because another local live range wants a cheaper register. The benefit is improved code size for some benchmarks on x86 and armv7. I measured no significant compile time increase and performance changes are noise. llvm-svn: 187140
* Allocate local registers in order for optimal coloring.Andrew Trick2013-07-251-4/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Also avoid locals evicting locals just because they want a cheaper register. Problem: MI Sched knows exactly how many registers we have and assumes they can be colored. In cases where we have large blocks, usually from unrolled loops, greedy coloring fails. This is a source of "regressions" from the MI Scheduler on x86. I noticed this issue on x86 where we have long chains of two-address defs in the same live range. It's easy to see this in matrix multiplication benchmarks like IRSmk and even the unit test misched-matmul.ll. A fundamental difference between the LLVM register allocator and conventional graph coloring is that in our model a live range can't discover its neighbors, it can only verify its neighbors. That's why we initially went for greedy coloring and added eviction to deal with the hard cases. However, for singly defined and two-address live ranges, we can optimally color without visiting neighbors simply by processing the live ranges in instruction order. Other beneficial side effects: It is much easier to understand and debug regalloc for large blocks when the live ranges are allocated in order. Yes, global allocation is still very confusing, but it's nice to be able to comprehend what happened locally. Heuristics could be added to bias register assignment based on instruction locality (think late register pairing, banks...). Intuituvely this will make some test cases that are on the threshold of register pressure more stable. llvm-svn: 187139
* typo.Adrian Prantl2013-07-251-1/+1
| | | | llvm-svn: 187135
* MI Sched: Register pressure heuristics.Andrew Trick2013-07-251-8/+32
| | | | | | Consider which set is being increased or decreased before comparing. llvm-svn: 187110
* MI Sched: track register pressure by importance of the set, not weight of ↵Andrew Trick2013-07-251-14/+20
| | | | | | the units. llvm-svn: 187109
* Dump LIS before regalloc. MI sched changes them.Andrew Trick2013-07-252-2/+4
| | | | llvm-svn: 187107
* Replace the "NoFramePointerElimNonLeaf" target option with a function attribute.Bill Wendling2013-07-251-0/+4
| | | | | | | | There's no need to specify a flag to omit frame pointer elimination on non-leaf nodes...(Honestly, I can't parse that option out.) Use the function attribute stuff instead. llvm-svn: 187093
* Fix a bug in IfConverter with nested predicates.Quentin Colombet2013-07-241-3/+3
| | | | | | | | | | | | | | | | | | | Prior to this patch, IfConverter may widen the cases where a sequence of instructions were executed because of the way it uses nested predicates. This result in incorrect execution. For instance, Let A be a basic block that flows conditionally into B and B be a predicated block. B can be predicated with A.BrToBPredicate into A iff B.Predicate is less "permissive" than A.BrToBPredicate, i.e., iff A.BrToBPredicate subsumes B.Predicate. The IfConverter was checking the opposite: B.Predicate subsumes A.BrToBPredicate. <rdar://problem/14379453> llvm-svn: 187071
* DAGCombiner: Pass the correct type to TargetLowering::isF(Abs|Neg)FreeTom Stellard2013-07-231-2/+2
| | | | | | | This commit also implements these functions for R600 and removes a test case that was relying on the buggy behavior. llvm-svn: 187007
* Reformat options.Eric Christopher2013-07-231-40/+41
| | | | llvm-svn: 186994
* [stackprotector] Changed isNoopBitcast/sameNoopInput to take ↵Michael Gottesman2013-07-221-2/+2
| | | | | | | | | | TargetLoweringBase instead of TargetLowering. Both functions only use functionality from TargetLoweringBase. rdar://13935163 llvm-svn: 186874
* [stackprotector] Refactored ssp prologue creation code into its own helper ↵Michael Gottesman2013-07-221-35/+41
| | | | | | | | | | function. No functionality change. rdar://13935163 llvm-svn: 186868
OpenPOWER on IntegriCloud