path: root/llvm/test/Transforms
Columns: Commit message, Author, Age, Files, Lines
* [X86] Adjust cost of FP_TO_UINT v8f32->v8i32Adam Nemet2014-03-301-0/+39
| | | | | | | | | | | | | | | There is no direct AVX instruction to convert to unsigned. I have some ideas how we may be able to do this with three vector instructions but the current backend just bails on this to get it scalarized. See the comment why we need to adjust the cost returned by BasicTTI. The test is a bit roundabout (and checks assembly rather than bit code) because I'd like it to work even if at some point we could vectorize this conversion. Fixes <rdar://problem/16371920> llvm-svn: 205159
* llvm/test/Transforms/LoopStrengthReduce/ARM64/lsr-*.ll: Add explicit triple ↵NAKAMURA Takumi2014-03-302-2/+2
| | | | | | arm64-unknown for targeting pecoff. llvm-svn: 205125
* ARM64: initial backend importTim Northover2014-03-2911-3/+478
| | | | | | | | | | | | This adds a second implementation of the AArch64 architecture to LLVM, accessible in parallel via the "arm64" triple. The plan over the coming weeks & months is to merge the two into a single backend, during which time thorough code review should naturally occur. Everything will be easier with the target in-tree though, hence this commit. llvm-svn: 205090
* SLPVectorizer: Take credit for free extractelement instructionsArnold Schwaighofer2014-03-281-0/+25
| | | | | | | | | Extract element instructions that will be removed when vectorizing lower the cost. Patch by Arch D. Robison! llvm-svn: 205020
* SLPVectorizer: Ignore users that are insertelements we can reschedule themArnold Schwaighofer2014-03-281-0/+24
| | | | | | Patch by Arch D. Robison! llvm-svn: 205018
* Revert "InstCombine: merge constants in both operands of icmp."Erik Verbruggen2014-03-281-63/+0
| | | | | | | | | This reverts commit r204912, and follow-up commit r204948. This introduced a performance regression, and the fix is not completely clear yet. llvm-svn: 205010
* Revert "GVN: merge overflow intrinsics with non-overflow instructions."Erik Verbruggen2014-03-281-67/+0
| | | | | | | | | This reverts commit r203553, and follow-up commits r203558 and r203574. I will follow this up on the mailing list to do it in a way that won't cause subtle PRE bugs. llvm-svn: 205009
* InstCombine: Don't combine constants on unsigned icmpsReid Kleckner2014-03-271-0/+10
| | | | | | | | | Fixes a miscompile introduced in r204912. It would miscompile code like (unsigned)(a + -49) <= 5U. The transform would turn this into (unsigned)a < 55U, which would return true for values in [0, 49], when it should not. llvm-svn: 204948
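The miscompile described in this message can be checked by brute force. The sketch below is illustrative Python, not LLVM code: it models 32-bit unsigned wraparound with a mask, and the names `original` and `transformed` are invented for this example.

```python
MASK32 = 0xFFFFFFFF  # models i32 wraparound; an assumption of this sketch


def original(a):
    """(unsigned)(a + -49) <= 5U, with the add wrapping modulo 2**32."""
    return ((a - 49) & MASK32) <= 5


def transformed(a):
    """(unsigned)a < 55U, the form produced by the faulty fold."""
    return (a & MASK32) < 55


# Inputs in a small window where the two forms give different answers.
mismatches = [a for a in range(0, 100) if original(a) != transformed(a)]
print(mismatches[0], mismatches[-1], len(mismatches))
```

In this model the two forms disagree for every a from 0 through 48, because a - 49 wraps to a very large unsigned value there; a = 49 itself satisfies both comparisons.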
* Prevent alias from pointing to weak aliases.Rafael Espindola2014-03-271-5/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds back r204781. Original message:
    Aliases are just another name for a position in a file. As such, the regular symbol resolutions are not applied. For example, given
        define void @my_func() { ret void }
        @my_alias = alias weak void ()* @my_func
        @my_alias2 = alias void ()* @my_alias
    We produce without this patch:
        .weak my_alias
        my_alias = my_func
        .globl my_alias2
        my_alias2 = my_alias
    That is, in the resulting ELF file my_alias, my_func and my_alias2 are just 3 names pointing to offset 0 of .text. That is *not* the semantics of IR linking. For example, linking in a
        @my_alias = alias void ()* @other_func
    would require the strong my_alias to override the weak one and my_alias2 would end up pointing to other_func. There is no way to represent that with aliases being just another name, so the best solution seems to be to just disallow it, converting a miscompile into an error. llvm-svn: 204934
* InstCombine: merge constants in both operands of icmp.Erik Verbruggen2014-03-271-0/+53
| | | | | | | | | | Transform: icmp X+Cst2, Cst into: icmp X, Cst-Cst2 when Cst-Cst2 does not overflow, and the add has nsw. llvm-svn: 204912
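The two side conditions in this message (the add has nsw, and Cst-Cst2 does not overflow) can be sanity-checked exhaustively at a small bit width. This is a sketch under assumptions: it uses the signed less-than predicate only, a 5-bit integer type to keep the loop small, and a made-up function name `fold_is_sound`; it is not how InstCombine itself is implemented.

```python
def fold_is_sound(bits=5):
    """Check icmp slt (X + Cst2), Cst  ==  icmp slt X, (Cst - Cst2)
    for every combination where neither side condition is violated."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    values = range(lo, hi + 1)
    for x in values:
        for cst2 in values:
            if not lo <= x + cst2 <= hi:
                continue  # the add would overflow, so nsw does not hold
            for cst in values:
                if not lo <= cst - cst2 <= hi:
                    continue  # Cst - Cst2 overflows; the fold is not applied
                if ((x + cst2) < cst) != (x < (cst - cst2)):
                    return False
    return True


print(fold_is_sound())
```

The check only covers a signed predicate; unsigned comparisons are a separate case and are not covered by this reasoning.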
* [X86][Vectorizer Cost Model] Correct vectorization cost model for v2i64->v2f64Quentin Colombet2014-03-271-0/+26
| | | | | | | | | | and v4i64->v4f64. The new costs match what we did for SSE2 and reflect the reality of our codegen. <rdar://problem/16381225> llvm-svn: 204884
* add 'requires asserts' to test that needs itJim Grosbach2014-03-271-0/+1
| | | | llvm-svn: 204882
* X86: Correct vectorization cost model for v8f32->v8i8.Jim Grosbach2014-03-271-0/+24
| | | | | | | | Fix the cost model to reflect the reality of our codegen. rdar://16370633 llvm-svn: 204880
* Treat lifetime.start'd memory like we treat freshly alloca'd memory. Patch ↵Nick Lewycky2014-03-261-0/+21
| | | | | | by Björn Steinbrink! llvm-svn: 204876
* Revert "Prevent alias from pointing to weak aliases."Rafael Espindola2014-03-261-10/+5
| | | | | | | | | This reverts commit r204781. I will follow up with the msan folks to see what they were trying to do with aliases to weak aliases. llvm-svn: 204784
* Prevent alias from pointing to weak aliases.Rafael Espindola2014-03-261-5/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Aliases are just another name for a position in a file. As such, the regular symbol resolutions are not applied. For example, given
        define void @my_func() { ret void }
        @my_alias = alias weak void ()* @my_func
        @my_alias2 = alias void ()* @my_alias
    We produce without this patch:
        .weak my_alias
        my_alias = my_func
        .globl my_alias2
        my_alias2 = my_alias
    That is, in the resulting ELF file my_alias, my_func and my_alias2 are just 3 names pointing to offset 0 of .text. That is *not* the semantics of IR linking. For example, linking in a
        @my_alias = alias void ()* @other_func
    would require the strong my_alias to override the weak one and my_alias2 would end up pointing to other_func. There is no way to represent that with aliases being just another name, so the best solution seems to be to just disallow it, converting a miscompile into an error. llvm-svn: 204781
* [InstCombine] Don't fold bitcast into store if it would need addrspacecastRichard Osborne2014-03-251-2/+16
| | | | | | | | | | | | | | | | | | Summary: Previously the code didn't check if the before and after types for the store were pointers to different address spaces. This resulted in instcombine using a bitcast to convert between pointers to different address spaces, causing an assertion due to the invalid cast. It is not appropriate to use addrspacecast in this case because it is not guaranteed to be a no-op cast. Instead bail out and do not do the transformation. CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D3117 llvm-svn: 204733
* Allow constant folding of ceil function whenever feasibleKarthik Bhat2014-03-241-0/+56
| | | | llvm-svn: 204583
* Revert r204076 for now - it caused significant regressions in a number ofLang Hames2014-03-232-57/+4
| | | | | | | | benchmarks. <rdar://problem/16368461> llvm-svn: 204558
* [Constant Hoisting] Fix multiple entries for the same basic block in PHI nodes.Juergen Ributzka2014-03-221-0/+46
| | | | | | | | | | | | | | | | | | | | A PHI node usually has only one value/basic block pair per incoming basic block. In the case of a switch statement it is possible that a following PHI node may have more than one such pair per incoming basic block. E.g.: %0 = phi i64 [ 123456, %case2 ], [ 654321, %Entry ], [ 654321, %Entry ] This is valid and the verifier doesn't complain, because both values are the same. Constant hoisting materializes the constant for each operand separately and the value is still the same, but the variable names have changed. As a result the verifier can't recognize anymore that they are the same value and complains. This fix adds special update code for PHI nodes in constant hoisting to prevent this corner case. This fixes <rdar://problem/16394449> llvm-svn: 204537
* Sink: Don't sink static allocas from the entry blockTom Stellard2014-03-211-0/+79
| | | | | | | CodeGen treats allocas outside the entry block as dynamically sized stack objects. llvm-svn: 204473
* [Constant Hoisting] Make the constant materialization cost operand dependentJuergen Ributzka2014-03-211-3/+3
| | | | | | | | | Extend the target hook to take also the operand index into account when calculating the cost of the constant materialization. Related to <rdar://problem/16381500> llvm-svn: 204435
* [Constant Hoisting] Change the algorithm to only track constants for ↵Juergen Ributzka2014-03-211-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | instructions. Originally the algorithm would search for expensive constants and track their users, which could be instructions and constant expressions. This change only tracks the constants for instructions, but constant expressions are indirectly covered too. If an operand is a constant expression, then we look through the expression to find any expensive constants. The algorithm now keeps track of the instruction and the operand index where the constant is used. This allows more precise hoisting of constant materialization code for PHI instructions, because we only hoist to the basic block of the incoming operand. Before we had to find the idom of all PHI operands and hoist the materialization code there. This also makes updating of instructions easier. Before we had to keep track of the original constant, find it in the instructions, and then replace it. Now we can just simply update the operand. Related to <rdar://problem/16381500> llvm-svn: 204433
* Revert "[Constant Hoisting] Extend coverage of the constant hoisting pass."Juergen Ributzka2014-03-201-5/+5
| | | | | | I will break this up into smaller pieces for review and recommit. llvm-svn: 204393
* [Constant Hoisting] Extend coverage of the constant hoisting pass.Juergen Ributzka2014-03-201-5/+5
| | | | | | | | | This commit extends the coverage of the constant hoisting pass, adds additional debug output and updates the function names according to the style guide. Related to <rdar://problem/16381500> llvm-svn: 204389
* Remove LowerInvoke's obsolete "-enable-correct-eh-support" optionMark Seaborn2014-03-205-91/+0
| | | | | | | | | | | | | | | This option caused LowerInvoke to generate code using SJLJ-based exception handling, but there is no code left that interprets the jmp_buf stack that the resulting code maintained (llvm.sjljeh.jblist). This option has been obsolete for a while, and replaced by SjLjEHPrepare. This leaves the default behaviour of LowerInvoke, which is to convert invokes to calls. Differential Revision: http://llvm-reviews.chandlerc.com/D3136 llvm-svn: 204388
* Add a test for LowerInvoke that doesn't use "-enable-correct-eh-support"Mark Seaborn2014-03-201-0/+25
| | | | | | | | | | | | | None of the existing tests for LowerInvoke check LowerInvoke's output, and all but one use "-enable-correct-eh-support", which is obsolete, so those tests will be removed when that option is removed. To make sure LowerInvoke will still have test coverage, this adds a test for its default mode which converts invokes to calls. Differential Revision: http://llvm-reviews.chandlerc.com/D3124 llvm-svn: 204344
* Fix use_iterator crash in ObjCArc from r203364Duncan P. N. Exon Smith2014-03-181-0/+30
| | | | | | | | | | The use_iterator redesign in r203364 introduced an increment past the end of a range in -objc-arc-contract. Added an explicit check for the end of the range. <rdar://problem/16333235> llvm-svn: 204195
* Tolerate unmangled names in sample profiles.Diego Novillo2014-03-183-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The compiler does not always generate linkage names. If a function has been inlined and its body elided, its linkage name may not be generated. When the binary executes, the profiler will use its unmangled name when attributing samples. This results in unmangled names in the input profile. We are currently failing hard when this happens. However, in this case all that happens is that we fail to attribute samples to the inlined function. While this means fewer optimization opportunities, it should not cause a compilation failure. This patch accepts all valid function names, regardless of whether they were mangled or not. Reviewers: chandlerc CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D3087 llvm-svn: 204142
* Use range metadata instead of introducing selects.Dan Gohman2014-03-172-4/+57
| | | | | | | | | | | | | | | | When GlobalOpt has determined that a GlobalVariable only ever has two values, it would convert the GlobalVariable to a boolean, and introduce SelectInsts at every load, to choose between the two possible values. These SelectInsts introduce overhead and other unpleasantness. This patch makes GlobalOpt just add range metadata to loads from such GlobalVariables instead. This enables the same main optimization (as seen in test/Transforms/GlobalOpt/integer-bool.ll), without introducing selects. The main downside is that it doesn't get the memory savings of shrinking such GlobalVariables, but this is expected to be negligible. llvm-svn: 204076
* llvm/test/Transforms/SampleProfile/syntax.ll: Suppress checking the message ↵NAKAMURA Takumi2014-03-151-1/+1
| | | | | | catalog in ENOENT. It is locale-dependent on Windows. llvm-svn: 203997
* Use DiagnosticInfo facility.Diego Novillo2014-03-141-7/+7
| | | | | | | | | | | | | | | | | | Summary: The sample profiler pass emits several error messages. Instead of just aborting the compiler with report_fatal_error, we can emit better messages using DiagnosticInfo. This adds a new sub-class of DiagnosticInfo to handle the sample profiler. Reviewers: chandlerc, qcolombet CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D3086 llvm-svn: 203976
* Remove the linker_private and linker_private_weak linkages.Rafael Espindola2014-03-132-25/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These linkages were introduced some time ago, but it was never very clear what exactly their semantics were or what they should be used for. Some investigation found these uses:
      * utf-16 strings in clang.
      * non-unnamed_addr strings produced by the sanitizers.
    It turns out they were just working around a more fundamental problem. For some sections a MachO linker needs a symbol in order to split the section into atoms, and llvm had no idea that was the case. I fixed that in r201700 and it is now safe to use the private linkage. When the object ends up in a section that requires symbols, llvm will use a 'l' prefix instead of a 'L' prefix and things just work.
    With that, these linkages were already dead, but there was a potential future user in the objc metadata information. I am still looking at CGObjcMac.cpp, but at this point I am convinced that linker_private and linker_private_weak are not what they need. The objc uses are currently split in
      * Regular symbols (no '\01' prefix). LLVM already directly provides whatever semantics they need.
      * Uses of a private name (start with "\01L" or "\01l") and private linkage. We can drop the "\01L" and "\01l" prefixes as soon as llvm agrees with clang on L being ok or not for a given section. I have two patches in code review for this.
      * Uses of private name and weak linkage.
    The last case is the one that one could think would fit one of these linkages. That is not the case. The semantics are
      * the linker will merge these symbols by *name*.
      * the linker will hide them in the final DSO.
    Given that the merging is done by name, any of the private (or internal) linkages would be a bad match. They allow llvm to rename the symbols, and that is really not what we want. From the llvm point of view, these objects should really be (linkonce|weak)(_odr)?. For now, just keeping the "\01l" prefix is probably the best for these symbols.
If we one day want to have a more direct support in llvm, IMHO what we should add is not a linkage, it is just a hidden_symbol attribute. It would be applicable to multiple linkages. For example, on weak it would produce the current behavior we have for objc metadata. On internal, it would be equivalent to private (and we should then remove private). llvm-svn: 203866
* Fix a bug in InstCombine where we would incorrectly attempt to construct aOwen Anderson2014-03-131-0/+12
| | | | | | | bitcast between pointers of two different address spaces if they happened to have the same pointer size. llvm-svn: 203862
* CodeGenPrep: sink extends of illegal types into use block.Manuel Jacob2014-03-131-0/+64
| | | | | | | | | | | | | | | | | | | Summary: This helps the instruction selector to lower an i64 * i64 -> i128 multiplication into a single instruction on targets which support it. This is an update of D2973 which was reverted because of a bug reported as PR19084. Reviewers: t.p.northover, chapuni Reviewed By: t.p.northover CC: llvm-commits, alex, chapuni Differential Revision: http://llvm-reviews.chandlerc.com/D3021 llvm-svn: 203797
* Fix PR18800. llvm intrinsic memcpy takes 5 arguments void ↵Karthik Bhat2014-03-132-9/+7
| | | | | | @llvm.memcpy.p0i8.p0i8.i32(i8* <dest>, i8* <src>, i32 <len>, i32 <align>, i1 <isvolatile>). The test case incorrectly uses the old format, causing the isVolatile function in MemIntrinsic to crash during SROA transformation. Modified the test case to use the correct signature of memcpy and memset. llvm-svn: 203750
* This test need the X86 backend, move it to the X86 sub directory.Rafael Espindola2014-03-121-0/+0
| | | | llvm-svn: 203725
* PR17473:Michael Zolotukhin2014-03-121-0/+67
| | | | | | | Don't normalize an expression during postinc transformation unless it's invertible. llvm-svn: 203719
* Resubmit "[SLPV] Recognize vectorizable intrinsics during SLP vectorization ..."Raul E. Silvera2014-03-121-0/+75
| | | | | | | This reverts commit 86cb795388643710dab34941ddcb5a9470ac39d8. The problems previously found have been resolved through other CLs. llvm-svn: 203707
* Reject alias to undefined symbols in the verifier.Rafael Espindola2014-03-126-20/+56
| | | | | | | | | | | | | | | On ELF and COFF an alias is just another name for a position in the file. There is no way to refer to a position in another file, so an alias to undefined is meaningless. MachO currently doesn't support aliases. The spec has a N_INDR, which when implemented will have a different set of restrictions. Adding support for it shouldn't be harder than any other IR extension. For now, having the IR represent what is actually possible with current tools makes it easier to fix the design of GlobalAlias. llvm-svn: 203705
* Allow switch-to-lookup table for tables with holes by adding bitmask checkHans Wennborg2014-03-121-2/+28
| | | | | | | | | | | | | | | | | | | | | | | | This allows us to generate table lookups for code such as:
        unsigned test(unsigned x) {
          switch (x) {
            case 100: return 0;
            case 101: return 1;
            case 103: return 2;
            case 105: return 3;
            case 107: return 4;
            case 109: return 5;
            case 110: return 6;
            default: return f(x);
          }
        }
    Since the results for cases 102, 104, etc. are not constants, the lookup table has holes in those positions. We therefore guard the table lookup with a bitmask check. Patch by Jasper Neumann! llvm-svn: 203694
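The bitmask guard from this message can be sketched in a few lines. This is illustrative Python rather than the IR the pass emits: the `lookup` function mirrors `test` from the message, `f` is a hypothetical stand-in for the default-case callee, and the dummy value stored in the holes is an assumption of the sketch.

```python
CASES = {100: 0, 101: 1, 103: 2, 105: 3, 107: 4, 109: 5, 110: 6}


def f(x):
    """Hypothetical stand-in for the default-case call f(x)."""
    return -1


BASE = min(CASES)
SIZE = max(CASES) - BASE + 1
# Dense table covering 100..110; holes (102, 104, ...) hold a dummy 0.
TABLE = [CASES.get(BASE + i, 0) for i in range(SIZE)]
# Bit i is set exactly when case BASE + i exists in the switch.
MASK = sum(1 << (case - BASE) for case in CASES)


def lookup(x):
    idx = x - BASE
    # Range check first, then the bitmask check that rejects the holes.
    if 0 <= idx < SIZE and (MASK >> idx) & 1:
        return TABLE[idx]
    return f(x)


print(bin(MASK), lookup(103), lookup(104))
```

The mask costs one shift and one AND per lookup, which is what lets a switch with a few missing cases still use a single dense table instead of a chain of compares.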
* Revert r203488 and r203520.Evan Cheng2014-03-121-23/+0
| | | | llvm-svn: 203687
* Fix crash in PRE.Erik Verbruggen2014-03-111-0/+24
| | | | | | | | | | After r203553 overflow intrinsics and their non-intrinsic (normal) instructions get hashed to the same value. This patch prevents PRE from moving an instruction into a predecessor block, and trying to add a phi node that gets two different types (the intrinsic result and the non-intrinsic result), resulting in a failing assert. llvm-svn: 203574
* IR: add a second ordering operand to cmpxchg for failureTim Northover2014-03-112-2/+2
| | | | | | | | | | | | | | | The syntax for "cmpxchg" should now look something like: cmpxchg i32* %addr, i32 42, i32 3 acquire monotonic where the second ordering argument gives the required semantics in the case that no exchange takes place. It should be no stronger than the first ordering constraint and cannot be either "release" or "acq_rel" (since no store will have taken place). rdar://problem/15996804 llvm-svn: 203559
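The two constraints on the failure ordering stated in this message can be written as a small validity check. This is a sketch, not the LLVM verifier: the function name is invented, and the numeric strength ranking is an assumption made here to give "no stronger than" a concrete meaning (the message itself does not define one).

```python
# Assumed strength ranking, weakest to strongest; not taken from the message.
STRENGTH = {"monotonic": 1, "acquire": 2, "release": 3, "acq_rel": 4, "seq_cst": 5}


def valid_cmpxchg_orderings(success, failure):
    """Return True if the (success, failure) pair obeys the stated rules."""
    if success not in STRENGTH or failure not in STRENGTH:
        return False
    # No store takes place on failure, so release semantics make no sense.
    if failure in ("release", "acq_rel"):
        return False
    # The failure ordering may be no stronger than the success ordering.
    return STRENGTH[failure] <= STRENGTH[success]


print(valid_cmpxchg_orderings("acquire", "monotonic"))
print(valid_cmpxchg_orderings("monotonic", "acquire"))
```

Under these assumptions the example from the message, `acquire monotonic`, is accepted, while `monotonic acquire` is rejected because the failure ordering would be stronger than the success ordering.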
* GVN: merge overflow intrinsics with non-overflow instructions.Erik Verbruggen2014-03-111-0/+43
| | | | | | | | | | | | | | | | | | | When an overflow intrinsic is followed by a non-overflow instruction, replace the latter with an extract. For example:
        %sadd = tail call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
        %sadd3 = add i32 %a, %b
    Here the add statement will be replaced by an extract. When an overflow intrinsic follows a non-overflow instruction, a clone of the intrinsic is inserted before the normal instruction, which makes it the same as the previous case. Subsequent runs of GVN can then clean up the duplicate instructions and insert the extract. This fixes PR8817. llvm-svn: 203553
* Use discriminator information in sample profiles.Diego Novillo2014-03-1010-51/+169
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When the sample profiles include discriminator information, use the discriminator values to distinguish instruction weights in different basic blocks. This modifies the BodySamples mapping to map <line, discriminator> pairs to weights. Instructions on the same line but different blocks, will use different discriminator values. This, in turn, means that the blocks may have different weights. Other changes in this patch: - Add tests for positive values of line offset, discriminator and samples. - Change data types from uint32_t to unsigned and int and do additional validation. Reviewers: chandlerc CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2857 llvm-svn: 203508
* MemCpyOpt: When merging memsets also merge the trivial case of two memsets ↵Benjamin Kramer2014-03-101-0/+12
| | | | | | | | with the same destination. The testcase is from PR19092, but I think the bug described there is actually a clang issue. llvm-svn: 203489
* For functions with ARM target specific calling convention, when simplify-libcallEvan Cheng2014-03-101-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | optimizes a call to an LLVM intrinsic to something that involves a call to a C library call, make sure it sets the right calling convention on the call. e.g.
        extern double pow(double, double);
        double t(double x) { return pow(10, x); }
    Compiles to something like this for AAPCS-VFP:
        define arm_aapcs_vfpcc double @t(double %x) #0 {
        entry:
          %0 = call double @llvm.pow.f64(double 1.000000e+01, double %x)
          ret double %0
        }
        declare double @llvm.pow.f64(double, double) #1
    Simplify libcall (part of instcombine) will turn the above into:
        define arm_aapcs_vfpcc double @t(double %x) #0 {
        entry:
          %__exp10 = call double @__exp10(double %x) #1
          ret double %__exp10
        }
        declare double @__exp10(double)
    The pre-instcombine code works because calls to LLVM builtins are special. Instruction selection will choose the right calling convention for the call. However, the code after instcombine is wrong. The call to __exp10 will use the C calling convention. I can think of 3 options to fix this.
      1. Make "C" calling convention just work since the target should know what CC is being used. This doesn't work because each function can use different CC with the "pcs" attribute.
      2. Have Clang add the right CC keyword on the calls to LLVM builtin. This will work but it doesn't match the LLVM IR specification which states these are "Standard C Library Intrinsics".
      3. Fix simplify libcall so the resulting calls to the C routines will have the proper CC keyword. e.g.
             %__exp10 = call arm_aapcs_vfpcc double @__exp10(double %x) #1
         This works and is the solution I implemented here.
    Both solutions #2 and #3 would work. After carefully considering the pros and cons, I decided to implement #3 for the following reasons.
      1. It doesn't change the "spec" of the intrinsics.
      2. It's a self-contained fix.
    There are a couple of potential downsides.
      1. There could be other places in the optimizer that are broken in the same way that are not addressed by this.
      2. There could be other calling conventions that need to be propagated by simplify-libcall that are not handled.
    But for now, this is the fix that I'm most comfortable with. llvm-svn: 203488
* Revert r203230, "CodeGenPrep: sink extends of illegal types into use block."NAKAMURA Takumi2014-03-091-46/+0
| | | | | | It choked i686 stage2. llvm-svn: 203386
* IR: Change inalloca's grammar a bitDavid Majnemer2014-03-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The grammar for LLVM IR is not well specified in any document but seems to obey the following rules:
      - Attributes which have parenthesized arguments are never preceded by commas. This form of attribute is the only one which ever has optional arguments. However, not all of these attributes support optional arguments: 'thread_local' supports an optional argument but 'addrspace' does not. Interestingly, 'addrspace' is documented as being a "qualifier". What constitutes a qualifier? I cannot find a definition.
      - Some attributes use a space between the keyword and the value. Examples of this form are 'align' and 'section'. These are always preceded by a comma.
      - Otherwise, the attribute has no argument. These attributes do not have a preceding comma.
    Sometimes an attribute goes before the instruction, between the instruction and its type, or after its type. 'atomicrmw' has 'volatile' between the instruction and the type while 'call' has 'tail' preceding the instruction. With all this in mind, it seems most consistent for 'inalloca' on an 'alloca' instruction to occur between the instruction and the type. Unlike the current formulation, there would be no preceding comma. The combination 'alloca inalloca' doesn't look particularly appetizing; perhaps a better spelling of 'inalloca' is down the road. llvm-svn: 203376