path: root/llvm/test
Commit message | Author | Age | Files | Lines
...
* Teach InstCombine to optimize extract of a value from a vector add operation ↵Nadav Rotem2013-01-151-0/+10
| | | | | | with a constant zero. llvm-svn: 172576
* 1. Hoist minus sign as high as possible in an attempt to revealShuxin Yang2013-01-152-33/+92
|    some optimization opportunities (in the enclosing super-expressions).
|
|    rule 1. (-0.0 - X) * Y => -0.0 - (X * Y)
|            if expression "-0.0 - X" has only one reference.
|
|    rule 2. (0.0 - X) * Y => -0.0 - (X * Y)
|            if expression "0.0 - X" has only one reference, and
|            the instruction is marked "noSignedZero".
|
| 2. Eliminate negation. (The compiler was already able to handle these
|    optimizations if the 0.0s are replaced with -0.0.)
|
|    rule 3. (0.0 - X) * (0.0 - Y) => X * Y
|    rule 4. (0.0 - X) * C => X * -C
|    if the expression is flagged "noSignedZero".
|
| 3. rule 5. (X*Y) * X => (X*X) * Y
|            if X != Y and the expression is flagged "UnsafeAlgebra".
|
|    The purpose of this transformation is two-fold:
|      a) to form a power expression (of X);
|      b) to potentially shorten the critical path: after the transformation,
|         the latency of the instruction Y is amortized by the expression X*X,
|         so Y is in a "less critical" position than it was before.
|
| 4. Remove the InstCombine code that simplifies "X * select".
|
|    The reasons are as follows:
|      a) The "select" is somewhat architecture-dependent, so the higher-level
|         optimizers cannot precisely predict whether the simplification
|         really yields any performance improvement.
|      b) The "select" operator is a bit complicated and tends to obscure
|         optimization opportunities. It is better to keep it as low as
|         possible in the expression tree and let CodeGen tackle the
|         optimization.
|
| llvm-svn: 172551
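Rule 1 is unconditional while rule 2 is gated on "noSignedZero". A minimal Python sketch of why (not part of the commit; all function names are hypothetical): with IEEE-754 doubles, "-0.0 - X" is a true negation of X, but "0.0 - X" yields +0.0 rather than -0.0 when X is +0.0, so the rewrite changes the sign of a zero result.

```python
import math

def sign_bit(v: float) -> int:
    """Return 1 if v has its sign bit set (this includes -0.0), else 0."""
    return 1 if math.copysign(1.0, v) < 0 else 0

def rule1_before(x: float, y: float) -> float:
    return (-0.0 - x) * y      # (-0.0 - X) * Y

def rule2_before(x: float, y: float) -> float:
    return (0.0 - x) * y       # (0.0 - X) * Y

def hoisted(x: float, y: float) -> float:
    return -0.0 - (x * y)      # -0.0 - (X * Y), the right-hand side of both rules

def same_value_and_sign(a: float, b: float) -> bool:
    """Compare two finite doubles, distinguishing +0.0 from -0.0."""
    return a == b and sign_bit(a) == sign_bit(b)

# Finite sample inputs only; NaN and infinity are outside this sketch.
SAMPLES = [0.0, -0.0, 1.5, -2.0, 3.0]
```

For X = +0.0 and Y = 1.0, rule2_before returns +0.0 while hoisted returns -0.0, which is only acceptable when the sign of zero is declared irrelevant.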
* [IR] Add verification for module flags with the "require" behavior.Daniel Dunbar2013-01-151-2/+16
| | | | llvm-svn: 172549
* [msan] Temporarily remove ICmpEQ tests.Evgeniy Stepanov2013-01-151-54/+0
| | | | | | They are failing on the bots. llvm-svn: 172540
* [msan] Fix handling of equality comparison of pointer vectors.Evgeniy Stepanov2013-01-151-0/+70
| Also improve test coverage of the handling of relational comparisons.
|
| llvm-svn: 172539
* Pattern-matched variables in post-inc-icmpzero.llRenato Golin2013-01-151-4/+4
| Test was failing for the clang-native-arm-cortex-a9 build-bot configuration.
|
| The failure was caused by the test using hard-coded variable names. The
| attached patch fixes it by replacing the hard-coded variable names with
| pattern-matched variable names.
|
| Patch by Manish Verma, ARM
|
| llvm-svn: 172534
* [IR] Add verifier support for llvm.module.flags.Daniel Dunbar2013-01-151-0/+37
| | | | | | | - Also, update the LangRef documentation on module flags to match the implementation. llvm-svn: 172498
* This patch fixes a Mips specific bug where Jack Carter2013-01-151-0/+37
| we need to generate a N64 compound relocation
| R_MIPS_GPREL_32/R_MIPS_64/R_MIPS_NONE.
|
| The bug was exposed by the SingleSource test case DuffsDevice.c.
|
| Contributor: Jack Carter
|
| llvm-svn: 172496
* This change is to implement following rules under the condition C_A and/or C_RShuxin Yang2013-01-141-0/+96
| ---------------------------------------------------------------------------
| C_A: reassociation is allowed
| C_R: reciprocal of a constant C is appropriate, which means
|   - 1/C is exact, or
|   - reciprocal is allowed and 1/C is neither a special value nor a denormal.
| ---------------------------------------------------------------------------
|
| rule 1: (X/C1) / C2 => X / (C2*C1)       (if C_A)
|                     => X * (1/(C2*C1))   (if C_A && C_R)
| rule 2: X*C1 / C2   => X * (C1/C2)       (if C_A)
| rule 3: (X/Y)/Z     => X/(Y*Z)           (if C_A && at least one of Y and Z is a symbolic value)
| rule 4: Z/(X/Y)     => (Z*Y)/X           (similar to rule 3)
| rule 5: C1/(X*C2)   => (C1/C2) / X       (if C_A)
| rule 6: C1/(X/C2)   => (C1*C2) / X       (if C_A)
| rule 7: C1/(C2/X)   => (C1/C2) * X       (if C_A)
|
| llvm-svn: 172488
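The "1/C is exact" half of condition C_R can be illustrated with a small Python sketch. It assumes IEEE-754 doubles; the function names are hypothetical, and the sketch does not model the "reciprocal is allowed" half, which depends on a flag. A reciprocal is exact when C is a power of two and 1/C is still a normal number; only then is replacing a division by a multiplication guaranteed to give the same result.

```python
import math
import sys

def reciprocal_is_exact(c: float) -> bool:
    """True if c is a power of two whose reciprocal is a normal double."""
    if c == 0.0 or math.isinf(c) or math.isnan(c):
        return False
    mantissa, _ = math.frexp(c)      # c == mantissa * 2**exp, 0.5 <= |mantissa| < 1
    if abs(mantissa) != 0.5:         # not a power of two
        return False
    r = 1.0 / c
    # Reject reciprocals that overflow to infinity or fall into the denormal range.
    return not math.isinf(r) and abs(r) >= sys.float_info.min

def divide(x: float, c: float) -> float:
    return x / c

def multiply_by_reciprocal(x: float, c: float) -> float:
    return x * (1.0 / c)

def always_agree(c: float, xs) -> bool:
    """Check whether x / c and x * (1/c) give identical results on all samples."""
    return all(divide(x, c) == multiply_by_reciprocal(x, c) for x in xs)
```

With C = 4.0 the two forms agree on every input tried, whereas with C = 3.0 they differ for some inputs because 1/3 is rounded before the multiplication.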
* [ms-inline asm] Extend support for parsing Intel bracketed memory operands thatChad Rosier2013-01-141-9/+191
| | | | | | | have an arbitrary ordering of the base register, index register and displacement. rdar://12527141 llvm-svn: 172484
* This patch addresses an incorrect transformation in the DAG combiner.Bill Schmidt2013-01-141-0/+34
| The included test case is derived from one of the GCC compatibility tests.
| The problem arises after the selection DAG has been converted to
| type-legalized form. The combiner first sees a 64-bit load that can be
| converted into a pre-increment form. The original load feeds into a SRL that
| isolates the upper 32 bits of the loaded doubleword. This looks like an
| opportunity for DAGCombiner::ReduceLoadWidth() to replace the 64-bit load
| with a 32-bit load.
|
| However, this transformation is not valid, as the replacement load is not a
| pre-increment load. The pre-increment load produces an extra result, which
| feeds a subsequent add instruction. The replacement load only has one result
| value, and this value is propagated to all uses of the pre-increment load,
| including the add. Because the add is looking for the second result value as
| its operand, it ends up attempting to add a constant to a token chain,
| resulting in a crash.
|
| So the patch simply disables this transformation for any load with more than
| two result values.
|
| llvm-svn: 172480
* SCEVExpander fix. RAUW needs to update the InsertedExpressions cache.Andrew Trick2013-01-141-0/+84
| | | | | | | | Note that this bug is only exposed because LTO fails to use TTI. Fixes self-LTO of clang. rdar://13007381. llvm-svn: 172462
* Added bugzilla PR number to test case.Michael Gottesman2013-01-131-0/+1
| | | | llvm-svn: 172369
* Fixed an infinite loop in the block escape analysis in ObjCARC caused by ↵Michael Gottesman2013-01-131-0/+86
| | | | | | | | 2x blocks each assigned a value via a phi-node causing each to depend on the other. A test case is provided as well. llvm-svn: 172368
* X86: Add patterns for X86ISD::VSEXT in registers.Benjamin Kramer2013-01-131-0/+176
| | | | | | | Those can occur when something between the sextload and the store is on the same chain and blocks isel. Fixes PR14887. llvm-svn: 172353
* Fix PR14547. Handle induction variables of sizes smaller than i32 (i8 ↵Nadav Rotem2013-01-131-0/+35
| | | | | | and i16). llvm-svn: 172348
* When lowering an inreg sext first shift left, then right arithmetically.Benjamin Kramer2013-01-121-3/+3
| | | | | | | Shifting right two times will only yield zero. Should fix SingleSource/UnitTests/SignlessTypes/factor. llvm-svn: 172322
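The shift order described in this commit can be sketched in Python as follows. This is an illustration under assumptions, not the committed lowering: a 32-bit register by default, values passed as unsigned bit patterns, and a hypothetical function name.

```python
def sext_inreg(value: int, from_bits: int, width: int = 32) -> int:
    """Sign-extend the low `from_bits` bits of `value` inside a `width`-bit register.

    Returns the result as an unsigned `width`-bit pattern.
    """
    mask = (1 << width) - 1
    shift = width - from_bits
    # Step 1: shift left so the narrow sign bit lands in the register's top bit.
    shifted = (value << shift) & mask
    # Reinterpret the bit pattern as a signed integer so that Python's >>
    # behaves as an arithmetic shift.
    if shifted & (1 << (width - 1)):
        shifted -= 1 << width
    # Step 2: arithmetic shift right replicates the sign bit into the upper bits.
    return (shifted >> shift) & mask
```

For example, sign-extending the low 8 bits of 0x80 in a 32-bit register gives 0xFFFFFF80, while 0x7F stays 0x7F.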
* Fixed bug in ObjCARC where we were changing a call from objc_autoreleaseRV ↵Michael Gottesman2013-01-121-1/+1
| | | | | | => objc_autorelease but were not updating the InstructionClass to IC_Autorelease. llvm-svn: 172288
* Fixed a bug where we were tail calling objc_autorelease causing an object to ↵Michael Gottesman2013-01-125-13/+97
| not be placed into an autorelease pool.
|
| This occurs because tail calling objc_autorelease eventually tail calls
| -[NSObject autorelease], which supports fast autorelease. This can cause us
| to violate the semantic guarantee of __autoreleasing variables that
| assignment to an __autoreleasing variable always yields an object that is
| placed into the innermost autorelease pool.
|
| The fix included in this patch works by:
|
| 1. In the peephole optimization function OptimizeIndividualFunctions, always
|    removing tail call from objc_autorelease.
| 2. Whenever we convert to/from an objc_autorelease, setting/unsetting the
|    tail call keyword as appropriate.
|
| *NOTE* I also handled the case where objc_autorelease is converted in
| OptimizeReturns to an autoreleaseRV, which still violates the ARC semantics.
| I will be removing that in a later patch and I wanted to make sure that the
| tree is in a consistent state vis-a-vis ARC always.
|
| Additionally, some test cases are provided, and all tests that have tail call
| marked objc_autorelease keywords have been modified so that tail call has
| been removed.
|
| *NOTE* One test fails due to a separate bug that I am going to commit soon.
| Thus I marked the check line TMP: instead of CHECK: so make check does not
| fail.
|
| llvm-svn: 172287
* This patch tackles the problem of parsing Mips Jack Carter2013-01-122-1/+95
| register names in the standalone assembler llvm-mc.
|
| Registers such as $A1 can represent either a 32-bit or a 64-bit register,
| depending on the instruction using it. In addition, depending on the ABI,
| $T0 can represent different 32-bit registers.
|
| The problem is resolved by changing the Mips-specific AsmParser td
| definitions to work together. Many cases of RegisterClass parameters are now
| RegisterOperand.
|
| Contributor: Vladimir Medic
|
| llvm-svn: 172284
* PPC: Implement efficient lowering of sign_extend_inreg.Nadav Rotem2013-01-111-87/+9
| | | | llvm-svn: 172269
* Update patch for the pad short functions pass for Intel Atom (only).Preston Gurd2013-01-111-0/+25
| | | | | | | | | Adds a check for -Oz, changes the code to not re-visit BBs, and skips over DBG_VALUE instrs. Patch by Andy Zhang. llvm-svn: 172258
* ARM Cost Model: Modify the target independent cost model to askNadav Rotem2013-01-111-3/+3
| | | | | | | | the target if it supports the different CAST types. We didn't do this on X86 because of the different register sizes and types, but on ARM this makes sense. llvm-svn: 172245
* For inline asm:Eric Christopher2013-01-111-0/+21
| | | | | | | | | | | - recognize string "{memory}" in the MI generation - mark as mayload/maystore when there's a memory clobber constraint. PR14859. Patch by Krzysztof Parzyszek llvm-svn: 172228
* Simplify writing floating types to assembly.Tim Northover2013-01-114-18/+74
| | | | | | | This removes previous special cases for each floating-point type in favour of a shared codepath. llvm-svn: 172189
* ARM Cost Model: We need to detect the max bitwidth of types in the loop in ↵Nadav Rotem2013-01-111-0/+52
| | | | | | | | | | | order to select the max vectorization factor. We don't have a detailed analysis on which values are vectorized and which stay scalars in the vectorized loop so we use another method. We look at reduction variables, loads and stores, which are the only ways to get information in and out of loop iterations. If the data types are extended and truncated then the cost model will catch the cost of the vector zext/sext/trunc operations. llvm-svn: 172178
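The selection described here can be sketched roughly as follows. This is a simplified model under stated assumptions rather than the cost model's actual code: the function name is hypothetical, and the factor is taken to be the vector register width divided by the widest type seen among the loop's loads, stores and reduction variables.

```python
def max_vectorization_factor(vector_register_bits: int, type_widths_bits) -> int:
    """Largest number of lanes such that the widest loop type still fits a register.

    `type_widths_bits` holds the bit widths of the types seen in loads, stores
    and reduction variables (an assumption of this sketch).
    """
    widest = max(type_widths_bits)
    # Never go below 1: a factor of 1 means the loop stays scalar.
    return max(1, vector_register_bits // widest)
```

With 128-bit vector registers, a loop that loads i8 values and stores i32 values is limited by the i32 type, giving a maximum factor of 4.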
* Converted test dont-tce-tail-marked-call.ll to use FileCheck.Michael Gottesman2013-01-111-2/+2
| | | | llvm-svn: 172172
* This commit is a 4x squash commit consisting of 4x functions converted to ↵Michael Gottesman2013-01-114-6/+12
| | | | | | | | | | | | use FileCheck instead of grep. Messages: Converted test case trivial_codegen_tailcall.ll to use FileCheck. Converted test return_constant.ll to use FileCheck instead of grep. Converted test reorder_load.ll to use FileCheck instead of grep. Converted test intervening-inst.ll to use FileCheck instead of grep. llvm-svn: 172171
* PR14904: Segmentation fault running pass 'Recognize loop idioms'Shuxin Yang2013-01-101-0/+20
| The root cause is mistakenly taking for granted that
| "dyn_cast<Instruction>(a-Value)" returns a non-NULL instruction.
|
| llvm-svn: 172145
* CastInst::castIsValid should return true if the dest type is the same asEvan Cheng2013-01-101-0/+36
| | | | | | Value's current type. The casting is trivial even for aggregate type. llvm-svn: 172143
* llvm/test/CodeGen/X86/ms-inline-asm.ll: Fixup; globals don't have a leading ↵NAKAMURA Takumi2013-01-101-2/+2
| underscore in their symbol names on Linux.
|
| llvm-svn: 172139
* [llvm-objdump] Emit addresses with the correct number of leading 0's.Michael J. Spencer2013-01-101-8/+8
| | | | llvm-svn: 172130
* [msan] Change va_start/va_copy shadow memset alignment to 8.Peter Collingbourne2013-01-101-0/+13
| | | | | | | | | This fixes va_start/va_copy of a va_list field which happens to not be laid out at a 16-byte boundary. Differential Revision: http://llvm-reviews.chandlerc.com/D276 llvm-svn: 172128
* PR14896: Handle memcpy from constant string where the memcpy size is larger ↵Evan Cheng2013-01-101-0/+13
| | | | | | than the string size. llvm-svn: 172124
* [ms-inline asm] Add support for calling functions from inline assembly.Chad Rosier2013-01-101-0/+18
| | | | | | Part of rdar://12991541 llvm-svn: 172121
* Teach InstCombine to hoist FABS and FNEG through FPTRUNC instructions. The ↵Owen Anderson2013-01-101-0/+19
| | | | | | application of these operations commutes with the truncation, so we should prefer to do them in the smallest size we can, to save register space, use smaller constant pool entries, etc. llvm-svn: 172117
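The claim that these operations commute with truncation can be checked numerically. The sketch below is an illustration, not part of the commit: it models FPTRUNC from double to float by round-tripping through a 4-byte float with Python's struct module, and the helper names are hypothetical. The property holds because round-to-nearest is symmetric around zero.

```python
import struct

def fptrunc(x: float) -> float:
    """Round a double to the nearest single-precision value, returned as a double."""
    return struct.unpack("f", struct.pack("f", x))[0]

def fneg_commutes(x: float) -> bool:
    # Negating before or after the truncation gives the same result.
    return fptrunc(-x) == -fptrunc(x)

def fabs_commutes(x: float) -> bool:
    # Taking the absolute value before or after the truncation gives the same result.
    return fptrunc(abs(x)) == abs(fptrunc(x))

# Finite values that fit in single precision; larger magnitudes would make
# struct.pack raise OverflowError.
SAMPLES = [0.1, -0.1, 1.0 / 3.0, -2.5e10, 3.141592653589793, 1e-40, 0.0]
```

Doing the FNEG or FABS on the narrower value is therefore safe for these inputs, which is what allows the operation to be moved to the smaller type.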
* LoopVectorizer: Fix a bug in the vectorization of BinaryOperators. The ↵Nadav Rotem2013-01-101-0/+25
| | | | | | | | BinaryOperator can be folded to an Undef, and we don't want to set NSW flags to undef vals. PR14878 llvm-svn: 172079
* Fix a copy/paste error in the IR Linker, casting an ArrayType instead of a ↵Joey Gouly2013-01-102-0/+9
| | | | | | VectorType. llvm-svn: 172054
* Fix TryToShrinkGlobalToBoolean in GlobalOpt, so that it does not discard ↵Joey Gouly2013-01-101-11/+16
| | | | | | address spaces. llvm-svn: 172051
* Stack Alignment: throw error if we can't satisfy the minimal alignmentManman Ren2013-01-102-1/+20
| | | | | | | | | | | | | | | | | | requirement when creating stack objects in MachineFrameInfo. Add CreateStackObjectWithMinAlign to throw error when the minimal alignment can't be achieved and to clamp the alignment when the preferred alignment can't be achieved. Same is true for CreateVariableSizedObject. Will not emit error in CreateSpillStackObject or CreateStackObject. As long as callers of CreateStackObject do not assume the object will be aligned at the requested alignment, we should not have miscompile since later optimizations which look at the object's alignment will have the correct information. rdar://12713765 llvm-svn: 172027
* ARM Cost model: Use the size of vector registers and widest vectorizable ↵Nadav Rotem2013-01-093-2/+62
| | | | | | instruction to determine the max vectorization factor. llvm-svn: 172010
* Fix a DAG combine bug: visitBRCOND() is transforming br(xor(x, y)) to br(x != y).Evan Cheng2013-01-091-0/+41
| It cached XOR's operands before calling visitXOR() but failed to update the
| operands when visitXOR() changed the XOR node.
|
| rdar://12968664
|
| llvm-svn: 171999
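The rewrite itself relies on a simple identity for one-bit values, sketched below in Python with hypothetical function names. It only illustrates why br(xor(x, y)) and br(x != y) take the same branch; it does not model the stale-operand bug that this commit fixes.

```python
from itertools import product

def branch_on_xor(x: bool, y: bool) -> bool:
    """Branch condition computed as a one-bit XOR."""
    return bool(x ^ y)

def branch_on_ne(x: bool, y: bool) -> bool:
    """Branch condition computed as an inequality compare."""
    return x != y

# Exhaustive check over all four combinations of one-bit inputs.
EQUIVALENT = all(
    branch_on_xor(x, y) == branch_on_ne(x, y)
    for x, y in product([False, True], repeat=2)
)
```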
* LICM: Hoist insertvalue/extractvalue out of loops.Benjamin Kramer2013-01-091-0/+26
| | | | | | Fixes PR14854. llvm-svn: 171984
* PowerPC: EH adjustmentsAdhemerval Zanella2013-01-091-3/+3
| This patch adjusts r171506 to make all DWARF encodings pc-relative for PPC64.
| It also adds R_PPC64_REL32 relocation handling in MCJIT (since the eh_frame
| will not generate PIC-relative relocations) and adds the emission of stubs
| created by the TTypeEncoding.
|
| llvm-svn: 171979
* add -march to the testNadav Rotem2013-01-091-1/+1
| | | | llvm-svn: 171956
* Efficient lowering of vector sdiv when the divisor is a splatted power of ↵Nadav Rotem2013-01-091-0/+72
| | | | | | | | | | | two constant. PR 14848. The lowered sequence is based on the existing sequence the target-independent DAG Combiner creates for the scalar case. Patch by Zvi Rackover. llvm-svn: 171953
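A common scalar sequence for signed division by a positive power of two, which the vector lowering is said to mirror per lane, can be sketched in Python. The exact DAG nodes emitted are an assumption here, and the function names are hypothetical. The idea is to add a bias of 2**k - 1 to negative dividends so that the final arithmetic shift rounds toward zero instead of toward negative infinity.

```python
def to_signed32(v: int) -> int:
    """Wrap an integer to the signed 32-bit range."""
    v &= 0xFFFFFFFF
    return v - (1 << 32) if v & 0x80000000 else v

def sdiv_pow2(x: int, k: int) -> int:
    """Divide signed 32-bit x by 2**k (1 <= k <= 31), rounding toward zero."""
    sign = x >> 31                            # arithmetic shift: 0 or -1
    bias = (sign & 0xFFFFFFFF) >> (32 - k)    # logical shift: 2**k - 1 if x < 0, else 0
    biased = to_signed32(x + bias)            # 32-bit add
    return biased >> k                        # arithmetic shift right by k

def trunc_div(a: int, b: int) -> int:
    """Reference signed division that rounds toward zero, as sdiv does."""
    q = abs(a) // abs(b)
    return -q if (a < 0) != (b < 0) else q
```

For instance, -7 divided by 2 gives -3 with this sequence, whereas a plain arithmetic shift of -7 by one bit would give -4.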
* MIsched: add an ILP window property to machine model.Andrew Trick2013-01-091-13/+20
| | | | | | | | | | This was an experimental option, but needs to be defined per-target. e.g. PPC A2 needs to aggressively hide latency. I converted some in-order scheduling tests to A2. Hal is working on more test cases. llvm-svn: 171946
* ARM Cost Model: Add a basic vectorization unrolling test.Nadav Rotem2013-01-091-3/+10
| | | | llvm-svn: 171931
* Remove the -licm pass from the loop vectorizer test because the loop ↵Nadav Rotem2013-01-0923-25/+25
| | | | | | vectorizer does it now. llvm-svn: 171930
* Cost Model: Move the 'max unroll factor' variable to the TTI and add initial ↵Nadav Rotem2013-01-093-2/+31
| | | | | | Cost Model support on ARM. llvm-svn: 171928