path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* COFF: Emit all MCSymbols rather than filtering out some of them | Reid Kleckner | 2013-09-17 | 1 | -19/+3
  In particular, this means we emit non-external symbols defined to variables, such as aliases or absolute addresses. This is needed to implement /safeseh, and it appears there was some confusion about what symbols to emit previously.
  llvm-svn: 190888
* COFF: Remove ExportSection, which has been dead since r114823 | Reid Kleckner | 2013-09-17 | 1 | -5/+0
  llvm-svn: 190887
* Move variable into assert to avoid unused variable warning. | Eric Christopher | 2013-09-17 | 1 | -2/+1
  llvm-svn: 190886
* Cleanup handling of constant function casts. | Matt Arsenault | 2013-09-17 | 1 | -24/+8
  Some of this code is no longer necessary since int<->ptr casts no longer occur as of r187444. This also fixes handling vectors of pointers, and adds a bunch of new testcases for vectors and address spaces.
  llvm-svn: 190885
* [PowerPC] Add a FIXME. | Bill Schmidt | 2013-09-17 | 1 | -0/+4
  Documenting a design choice to generate only medium model sequences for TLS addresses at this time. Small and large code models could be supported if necessary.
  llvm-svn: 190883
* [PowerPC] Fix problems with large code model (PR17169). | Bill Schmidt | 2013-09-17 | 2 | -8/+22
  Large code model on PPC64 requires creating and referencing TOC entries when using the addis/ld form of addressing. This was not being done in all cases. The changes in this patch to PPCAsmPrinter::EmitInstruction() fix this. Two test cases are also modified to reflect this requirement.

  Fast-isel was not creating correct code for loading floating-point constants using large code model. This also requires the addis/ld form of addressing. Previously we were using the addis/lfd shortcut which is only applicable to medium code model. One test case is modified to reflect this requirement.
  llvm-svn: 190882
* Costmodel: Add support for horizontal vector reductions | Arnold Schwaighofer | 2013-09-17 | 3 | -0/+296
  Upcoming SLP vectorization improvements will want to be able to estimate costs of horizontal reductions. Add infrastructure to support this.

  We model reductions as a series of (shufflevector,add) tuples ultimately followed by an extractelement. For example, for an add-reduction of <4 x float> we could generate the following sequence:

    (v0, v1, v2, v3)
      \   \  /  /
        \  \  /
          +  +

    (v0+v2, v1+v3, undef, undef)
       \      /
    ((v0+v2) + (v1+v3), undef, undef)

    %rdx.shuf = shufflevector <4 x float> %rdx, <4 x float> undef, <4 x i32> <i32 2, i32 3, i32 undef, i32 undef>
    %bin.rdx = fadd <4 x float> %rdx, %rdx.shuf
    %rdx.shuf7 = shufflevector <4 x float> %bin.rdx, <4 x float> undef, <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
    %bin.rdx8 = fadd <4 x float> %bin.rdx, %rdx.shuf7
    %r = extractelement <4 x float> %bin.rdx8, i32 0

  This commit adds a cost model interface "getReductionCost(Opcode, Ty, Pairwise)" that will allow clients to ask for the cost of such a reduction (as backends might generate more efficient code than the cost of the individual instructions summed up). This interface is exercised by the CostModel analysis pass, which looks for reduction patterns like the one above, starting at extractelements, and if it sees a matching sequence will call the cost model interface.

  We will also support a second form of pairwise reduction that is well supported on common architectures (haddps, vpadd, faddp).

    (v0, v1, v2, v3)
      \   /    \  /
    (v0+v1, v2+v3, undef, undef)
       \     /
    ((v0+v1)+(v2+v3), undef, undef, undef)

    %rdx.shuf.0.0 = shufflevector <4 x float> %rdx, <4 x float> undef, <4 x i32> <i32 0, i32 2, i32 undef, i32 undef>
    %rdx.shuf.0.1 = shufflevector <4 x float> %rdx, <4 x float> undef, <4 x i32> <i32 1, i32 3, i32 undef, i32 undef>
    %bin.rdx.0 = fadd <4 x float> %rdx.shuf.0.0, %rdx.shuf.0.1
    %rdx.shuf.1.0 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef, <4 x i32> <i32 0, i32 undef, i32 undef, i32 undef>
    %rdx.shuf.1.1 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef, <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
    %bin.rdx.1 = fadd <4 x float> %rdx.shuf.1.0, %rdx.shuf.1.1
    %r = extractelement <4 x float> %bin.rdx.1, i32 0
  llvm-svn: 190876
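The two reduction shapes described in r190876 can be modelled outside LLVM. The following is a minimal Python sketch, not code from the commit: the function names `splitting_reduction` and `pairwise_reduction` are hypothetical, lanes that the IR marks `undef` are represented as `None`, and a power-of-two vector width is assumed.

```python
from operator import add


def splitting_reduction(lanes, op=add):
    """First form: fold the upper half of the active lanes onto the lower half."""
    lanes = list(lanes)
    width = len(lanes)
    assert width > 0 and width & (width - 1) == 0, "sketch assumes a power-of-two width"
    while width > 1:
        half = width // 2
        # Models the shufflevector that moves lanes [half, width) down to [0, half).
        shuffled = lanes[half:width]
        # Models the vector add; lanes at or above `half` are undef afterwards.
        lanes = [op(a, b) for a, b in zip(lanes[:half], shuffled)] + [None] * (len(lanes) - half)
        width = half
    return lanes[0]  # models extractelement ..., i32 0


def pairwise_reduction(lanes, op=add):
    """Second form: add neighbouring lanes (0 with 1, 2 with 3, ...) each step."""
    lanes = list(lanes)
    width = len(lanes)
    assert width > 0 and width & (width - 1) == 0, "sketch assumes a power-of-two width"
    while width > 1:
        half = width // 2
        evens = lanes[0:width:2]  # models the shuffle mask <0, 2, ...>
        odds = lanes[1:width:2]   # models the shuffle mask <1, 3, ...>
        lanes = [op(a, b) for a, b in zip(evens, odds)] + [None] * (len(lanes) - half)
        width = half
    return lanes[0]  # models extractelement ..., i32 0


splitting_result = splitting_reduction([1.0, 2.0, 3.0, 4.0])
pairwise_result = pairwise_reduction([1.0, 2.0, 3.0, 4.0])
```

Both functions take log2(width) shuffle-and-add steps. They differ only in which lanes are combined at each step, which is presumably why the cost interface takes a separate `Pairwise` argument.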
* SLPVectorizer: Don't vectorize phi nodes that use invoke values | Arnold Schwaighofer | 2013-09-17 | 1 | -0/+12
  We can't insert an insertelement after an invoke. We would have to split a critical edge. So when we see a phi node that uses an invoke we just give up.

  radar://14990770
  llvm-svn: 190871
* [InstCombiner] Slice a big load in two loads when the elements are next to each other in memory. | Quentin Colombet | 2013-09-17 | 1 | -0/+285
  The motivation was to get rid of truncate and shift right instructions that get in the way of paired load or floating point load. For example, consider:

    struct Complex {
      float real;
      float imm;
    };

  When accessing a complex, llvm was generating a 64-bit load and the imm field was obtained by a trunc(lshr) sequence, resulting in poor code generation, at least for x86.

  The idea is to declare that two load instructions are the canonical form for loading two arithmetic types that are next to each other in memory. Two scalar loads at a constant offset from each other are pretty easy to detect for the sorts of passes that like to mess with loads.

  <rdar://problem/14477220>
  llvm-svn: 190870
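To make the equivalence behind r190870 concrete, here is a small Python sketch (not LLVM code). It assumes a little-endian layout and 32-bit floats, and all variable names are illustrative. It builds the memory image of the `Complex` struct and checks that one 64-bit load followed by trunc / trunc(lshr) yields the same bits as two adjacent 32-bit loads.

```python
import struct

# Memory image of `struct Complex { float real; float imm; }` (little-endian assumed).
memory = struct.pack("<ff", 1.5, -2.25)

# Before the change: one 64-bit load, then bit manipulation to pick each field.
wide = struct.unpack_from("<Q", memory, 0)[0]
real_bits = wide & 0xFFFFFFFF           # models trunc i64 -> i32
imm_bits = (wide >> 32) & 0xFFFFFFFF    # models trunc(lshr i64 %wide, 32)

# After the change: two scalar loads at a constant offset from each other.
real_slice = struct.unpack_from("<I", memory, 0)[0]
imm_slice = struct.unpack_from("<I", memory, 4)[0]


def bits_to_float(bits):
    """Reinterpret a 32-bit integer pattern as an IEEE-754 single."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


real_value = bits_to_float(real_slice)
imm_value = bits_to_float(imm_slice)
```

On a big-endian target the two halves would swap, so the offsets (0 and 4) used here are specific to the little-endian assumption.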
* Remove unused code, which had been commented out. | Preston Gurd | 2013-09-17 | 1 | -5/+0
  llvm-svn: 190869
* Added documentation to getMemsetStores. | Serge Pavlov | 2013-09-17 | 1 | -0/+18
  llvm-svn: 190866
* Add llvm.x86.* intrinsics for Intel SHA Extensions | Ben Langmuir | 2013-09-17 | 1 | -14/+26
  Add llvm.x86.* intrinsics for all of the Intel SHA Extensions instructions, as well as tests. Also remove mayLoad and hasSideEffects, which can be inferred from the instruction patterns.
  llvm-svn: 190864
* [asan] inline the calls to __asan_stack_free_* with small sizes. Yet another 10%-20% speedup for use-after-return | Kostya Serebryany | 2013-09-17 | 1 | -3/+48
  llvm-svn: 190863
* [ARM] Fix the deprecation of MCR encodings that map to CP15{ISB,DSB,DMB}. | Joey Gouly | 2013-09-17 | 2 | -9/+27
  llvm-svn: 190862
* Bugfix for PR17099: | Stepan Dyatkovskiy | 2013-09-17 | 1 | -8/+15
  Wrong cast operation. MergeFunctions emits Bitcast instead of a pointer-to-integer operation. The patch fixes the MergeFunctions::writeThunk function. It replaces the unconditional Bitcast creation with a "Value* createCast(...)" method that checks the operand types and selects the proper instruction. See the unit test for an example.
  llvm-svn: 190859
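The kind of type-directed selection that r190859 describes could look roughly like the Python sketch below. This is not the `createCast` from the patch: the function name `select_cast`, the string encoding of types, and the integer-to-pointer branch are all assumptions added for illustration. The log only states that a pointer-to-integer operation was needed where a Bitcast had been emitted.

```python
def is_pointer(type_name):
    """Hypothetical type model: pointer types are spelled with a trailing '*'."""
    return type_name.endswith("*")


def is_integer(type_name):
    """Hypothetical type model: integer types are spelled 'i<bits>', e.g. 'i64'."""
    return type_name.startswith("i") and type_name[1:].isdigit()


def select_cast(src_type, dst_type):
    """Pick a cast opcode from the operand types instead of always using bitcast."""
    if is_pointer(src_type) and is_integer(dst_type):
        return "ptrtoint"   # the case PR17099 is about
    if is_integer(src_type) and is_pointer(dst_type):
        return "inttoptr"   # assumed symmetric case, not stated in the log
    return "bitcast"        # everything else keeps the previous behaviour


thunk_cast = select_cast("i8*", "i64")
```

Keeping `bitcast` as the fallback means the sketch only changes behaviour for the mixed pointer/integer pairs.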
* AVX-512: Converted to Unix style | Elena Demikhovsky | 2013-09-17 | 1 | -3070/+3070
  llvm-svn: 190851
* Add AES and SHA instructions to the load folding tables. | Craig Topper | 2013-09-17 | 1 | -0/+25
  llvm-svn: 190850
* Fix column alignment. No functional change. | Craig Topper | 2013-09-17 | 1 | -4/+4
  llvm-svn: 190849
* Implement 3 AArch64 neon instructions : umov smov ins. | Kevin Qin | 2013-09-17 | 5 | -14/+468
  llvm-svn: 190839
* [SelectionDAG] Teach the vector scalarizer about TRUNCATE. | Quentin Colombet | 2013-09-17 | 2 | -3/+4
  When a truncate node defines a legal vector type but uses an illegal vector type, the legalization process was splitting the vector until <1 x vector> type, but then it was failing to scalarize the node because it did not know how to handle TRUNCATE.

  <rdar://problem/14989896>
  llvm-svn: 190830
* Debug info: Fix PR16736 and rdar://problem/14990587. | Adrian Prantl | 2013-09-16 | 5 | -5/+7
  A DBG_VALUE is register-indirect iff the first operand is a register _and_ the second operand is an immediate.
  llvm-svn: 190821
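The rule stated in r190821 is a two-part predicate. The following Python sketch models it with a hypothetical `Operand` tuple (the `kind` strings and the function name are assumptions, not LLVM's MachineOperand API):

```python
from collections import namedtuple

# Hypothetical operand model: kind is "reg", "imm", or anything else.
Operand = namedtuple("Operand", ["kind", "value"])


def is_register_indirect(operands):
    """True only if operand 0 is a register AND operand 1 is an immediate."""
    return (
        len(operands) >= 2
        and operands[0].kind == "reg"
        and operands[1].kind == "imm"
    )


indirect = is_register_indirect([Operand("reg", "rbp"), Operand("imm", -8)])
direct = is_register_indirect([Operand("reg", "rax"), Operand("reg", None)])
```

Checking only the first operand would classify the second example as indirect as well, which is the kind of misclassification the "_and_" in the message rules out.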
* MemCpyOptimizer: Use max legal int size instead of pointer size | Matt Arsenault | 2013-09-16 | 2 | -5/+15
  If there are no legal integers, assume 1 byte.

  This makes more sense than using the pointer size as a guess for the maximum GPR width. It is conceivable to want to use some 64-bit pointers on a target where 64-bit integers aren't legal.
  llvm-svn: 190817
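The selection rule from r190817 can be sketched in a few lines of Python. The function name `max_int_bytes` and the list-of-bit-widths input are assumptions for illustration. They stand in for whatever the target's data layout reports as legal integer widths.

```python
def max_int_bytes(legal_int_bits):
    """Largest legal integer size in bytes, or 1 if no integer width is legal."""
    if not legal_int_bits:
        return 1  # "If there are no legal integers, assume 1 byte."
    return max(legal_int_bits) // 8


# A target with 64-bit pointers but only 8/16/32-bit legal integers:
# the pointer size would suggest 8 bytes, the legal-integer rule gives 4.
chunk_for_32bit_ints = max_int_bytes([8, 16, 32])
chunk_without_legal_ints = max_int_bytes([])
```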
* Use reference instead of copy. | Jakub Staszak | 2013-09-16 | 2 | -3/+3
  llvm-svn: 190813
* [PowerPC] Fix PR17155 - Ignore COPY_TO_REGCLASS during emit. | Bill Schmidt | 2013-09-16 | 1 | -1/+8
  Fast-isel generates a COPY_TO_REGCLASS for widening f32 to f64, which is a nop on PPC64. This is needed to keep the register class system happy, but on the fast-isel path it is not removed before emit as it is for DAG select. Ignore this op when emitting instructions.
  llvm-svn: 190795
* Don't vectorize if there are outside loop users of the induction variable. | Arnold Schwaighofer | 2013-09-16 | 1 | -0/+6
  We would have to compute the pre increment value, either by computing it on every loop iteration or by splitting the edge out of the loop and inserting a computation for it there.

  For now, just give up vectorizing such loops.

  Fixes PR17179.
  llvm-svn: 190790
* [msan] Check return value of main(). | Evgeniy Stepanov | 2013-09-16 | 1 | -4/+13
  llvm-svn: 190782
* This patch implements Mips load/store instructions from/to coprocessor 2. Test cases are added. | Vladimir Medic | 2013-09-16 | 3 | -1/+60
  llvm-svn: 190780
* ARM: Deduplicate ConstantPoolValues. | Benjamin Kramer | 2013-09-16 | 2 | -47/+36
  llvm-svn: 190779
* [SystemZ] Improve extload handling | Richard Sandiford | 2013-09-16 | 2 | -74/+92
  The port originally had special patterns for extload, mapping them to the same instructions as sextload. It seemed neater to have patterns that match "an extension that is allowed to be signed" and "an extension that is allowed to be unsigned".

  This was originally meant to be a clean-up, but it does improve the handling of promoted integers a little, as shown by args-06.ll.
  llvm-svn: 190777
* Make F16C feature flag imply AVX rather than just checking both at the patterns. | Craig Topper | 2013-09-16 | 2 | -2/+3
  llvm-svn: 190775
* Implement function prefix data as an IR feature. | Peter Collingbourne | 2013-09-16 | 14 | -4/+108
  Previous discussion: http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-July/063909.html

  Differential Revision: http://llvm-reviews.chandlerc.com/D1191
  llvm-svn: 190773
* PPC: Don't restrict lvsl generation to after type legalization | Hal Finkel | 2013-09-15 | 1 | -1/+2
  This is a re-commit of r190764, with an extra check to make sure that we're not performing the transformation on illegal types (a small test case has been added for this as well).

  Original commit message:

  The PPC backend uses a target-specific DAG combine to turn unaligned Altivec loads into a permutation-based sequence when possible. Unfortunately, the target-specific DAG combine is not always called on all loads of interest (sometimes the routines in DAGCombine call CombineTo such that the new node and users are not added to the worklist); allowing the combine to trigger early (before type legalization) mitigates this problem. Because the autovectorizers only create legal vector types, I don't expect a lot of cases where this optimization is enabled by type legalization in practice.
  llvm-svn: 190771
* Replace some unnecessary vector copies with references. | Benjamin Kramer | 2013-09-15 | 5 | -9/+7
  llvm-svn: 190770
* ELF: Add support for the exclude section bit for gas compat. | Benjamin Kramer | 2013-09-15 | 3 | -5/+13
  llvm-svn: 190769
* MC: Add support for '?' flags in .section directives | David Majnemer | 2013-09-15 | 1 | -2/+20
  Summary:
  The '?' flag uses the last section group if the last had a section group. We treat combining an explicit section group and the '?' as a hard error.

  This fixes PR17198.

  Reviewers: rafael, bkramer
  Reviewed By: bkramer
  CC: llvm-commits
  Differential Revision: http://llvm-reviews.chandlerc.com/D1686
  llvm-svn: 190768
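The two rules in the r190768 summary (inherit the previous group, reject an explicit group combined with '?') can be expressed as the short Python sketch below. It is a model only: `resolve_section_group`, `SectionGroupError`, and the argument names are hypothetical and do not correspond to the MC parser's actual functions.

```python
class SectionGroupError(ValueError):
    """Raised when '?' is combined with an explicit section group."""


def resolve_section_group(flags, explicit_group, previous_group):
    """Return the group a new section belongs to, or None if it has no group.

    flags          -- the flag string from the .section directive, e.g. "ax?"
    explicit_group -- group name written in the directive, or None
    previous_group -- group of the previously active section, or None
    """
    if "?" in flags:
        if explicit_group is not None:
            # The commit treats this combination as a hard error.
            raise SectionGroupError("'?' cannot be combined with an explicit group")
        # Inherit the last section's group, if it had one.
        return previous_group
    return explicit_group


inherited = resolve_section_group("ax?", None, "grp")
ungrouped = resolve_section_group("ax?", None, None)
```

Passing both `"?"` in the flags and an explicit group raises `SectionGroupError`, mirroring the hard error described in the summary.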
* Fix alignment of unwind data. | Kai Nacke | 2013-09-15 | 1 | -7/+12
  For alignment purposes, the instruction array will always have an even number of entries, with the final entry potentially unused (in which case the array will be one longer than indicated by the count of unwind codes field).

  Reviewed by Anton Korobeynikov, Charles Davis and Nico Rieck.
  llvm-svn: 190767
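The padding rule in r190767 amounts to rounding the unwind-code count up to the next even number. A minimal Python sketch follows. The function names are hypothetical, and the 2-byte slot size is an assumption about the Win64 unwind code layout rather than something stated in the log.

```python
UNWIND_CODE_SIZE = 2  # assumed size in bytes of one unwind code slot


def padded_slot_count(count_of_codes):
    """Round the number of unwind code slots up to an even number."""
    return (count_of_codes + 1) & ~1


def unwind_array_bytes(count_of_codes):
    """Bytes occupied by the array, including the possibly unused final slot."""
    return padded_slot_count(count_of_codes) * UNWIND_CODE_SIZE


odd_case = padded_slot_count(3)   # one unused slot is appended
even_case = padded_slot_count(4)  # already even, nothing appended
```

Under the 2-byte assumption, an even slot count keeps whatever follows the array 4-byte aligned.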
* Generate IMAGE_REL_AMD64_ADDR32NB relocations for SEH data structures. | Kai Nacke | 2013-09-15 | 1 | -5/+21
  The Win64 EH data structures must be of type IMAGE_REL_AMD64_ADDR32NB instead of IMAGE_REL_AMD64_ADDR32. This is easily achieved by adding the VK_COFF_IMGREL32 modifier to the symbol reference.

  Also change references to the start and end of the SEH range of a function to be offsets from the start of the function.

  Reviewed by Jim Grosbach, Charles Davis and Nico Rieck.
  llvm-svn: 190766
* Revert r190764: PPC: Don't restrict lvsl generation to after type legalization | Hal Finkel | 2013-09-15 | 1 | -0/+1
  This is causing test-suite failures.

  Original commit message:

  The PPC backend uses a target-specific DAG combine to turn unaligned Altivec loads into a permutation-based sequence when possible. Unfortunately, the target-specific DAG combine is not always called on all loads of interest (sometimes the routines in DAGCombine call CombineTo such that the new node and users are not added to the worklist); allowing the combine to trigger early (before type legalization) mitigates this problem. Because the autovectorizers only create legal vector types, I don't expect a lot of cases where this optimization is enabled by type legalization in practice.
  llvm-svn: 190765
* PPC: Don't restrict lvsl generation to after type legalization | Hal Finkel | 2013-09-15 | 1 | -1/+0
  The PPC backend uses a target-specific DAG combine to turn unaligned Altivec loads into a permutation-based sequence when possible. Unfortunately, the target-specific DAG combine is not always called on all loads of interest (sometimes the routines in DAGCombine call CombineTo such that the new node and users are not added to the worklist); allowing the combine to trigger early (before type legalization) mitigates this problem. Because the autovectorizers only create legal vector types, I don't expect a lot of cases where this optimization is enabled by type legalization in practice.
  llvm-svn: 190764
* Prevent assert in CombinerGlobalAA with null values | Hal Finkel | 2013-09-15 | 1 | -1/+1
  DAGCombiner::isAlias can be called with SrcValue1 or SrcValue2 null, and we can't use AA in this case (if we try, then the casting code in AA will assert).
  llvm-svn: 190763
* Expand the mask capability for deciding which functions are mips16 and mips32 | Reed Kotler | 2013-09-15 | 1 | -2/+3
  so it can be better used for general interoperability testing between mips32 and mips16.
  llvm-svn: 190762
* Remove unused StringRef that no compiler warned about, I wonder why. | Benjamin Kramer | 2013-09-14 | 1 | -1/+0
  llvm-svn: 190759
* Add the remaining Intel SHA instructions | Ben Langmuir | 2013-09-14 | 1 | -0/+27
  Also assembly/disassembly tests, and for sha256rnds2, aliases with an explicit xmm0 dependency.
  llvm-svn: 190754
* Fix spelling. | Robert Wilhelm | 2013-09-14 | 1 | -1/+1
  llvm-svn: 190750
* Fix spelling. | Robert Wilhelm | 2013-09-14 | 1 | -1/+1
  llvm-svn: 190749
* Remove the long, long defunct IR block placement pass. | Chandler Carruth | 2013-09-14 | 3 | -154/+0
  This pass was based on the previous (essentially unused) profiling infrastructure and the assumption that by ordering the basic blocks at the IR level in a particular way, the correct layout would happen in the end. This sometimes worked, and mostly didn't. It also was a really naive implementation of the classical paper that dates from when branch predictors were primarily directional and when loop structure wasn't commonly available. It also didn't factor into the equation non-fallthrough branches and other machine level details.

  Anyways, for all of these reasons and more, I wrote MachineBlockPlacement, which completely supersedes this pass. It both uses modern profile information infrastructure, and actually works. =]
  llvm-svn: 190748
* Fixed bug when generating Load Upper Immediate microMIPS instruction. | Zoran Jovanovic | 2013-09-14 | 2 | -2/+2
  llvm-svn: 190746
* Support for microMIPS DIV instructions. | Zoran Jovanovic | 2013-09-14 | 2 | -1/+5
  llvm-svn: 190745
* Support for misc microMIPS instructions. | Zoran Jovanovic | 2013-09-14 | 4 | -16/+74
  llvm-svn: 190744
* Make PrettyStackTraceEntry use ManagedStatic for its ThreadLocal. | Filip Pizlo | 2013-09-13 | 1 | -8/+20
  This was somewhat tricky because ~PrettyStackTraceEntry() may run after llvm_shutdown() has been called. This is rare and only happens for a common idiom used in the main() functions of command-line tools. This works around the idiom by skipping the stack clean-up if the PrettyStackTraceHead ManagedStatic is not constructed (i.e. llvm_shutdown() has been called).
  llvm-svn: 190730