path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* Typo. | Chad Rosier | 2012-08-30 | 1 | -1/+1
    llvm-svn: 162952
* Currently targets that do not support selects with scalar conditions and vector operands - scalarize the code. | Nadav Rotem | 2012-08-30 | 1 | -1/+65
    ARM is such a target because it does not support CMOV of vectors. To implement this efficiently, we broadcast the condition bit and use a sequence of NAND-OR to select between the two operands. This is the same sequence we use for targets that don't have vector BLENDs (like SSE2).
    rdar://12201387
    llvm-svn: 162926
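The select sequence described in the commit above can be sketched on plain integers. This is a minimal illustration, not the code from the patch: `Vec4`, `broadcastCond`, and `selectByMask` are hypothetical names, a four-lane array stands in for a vector register, and the message's "NAND-OR" is read here as AND, AND with the inverted mask, then OR (that reading is an assumption).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a 4 x i32 vector value.
using Vec4 = std::array<std::uint32_t, 4>;

// Broadcast a scalar condition into a lane mask:
// all ones when true, all zeros when false.
inline std::uint32_t broadcastCond(bool Cond) {
  return Cond ? ~std::uint32_t{0} : std::uint32_t{0};
}

// Branch-free select, applied lane by lane:
// (A & Mask) | (B & ~Mask) yields A when Cond is true, B otherwise.
inline Vec4 selectByMask(bool Cond, const Vec4 &A, const Vec4 &B) {
  const std::uint32_t Mask = broadcastCond(Cond);
  Vec4 Result{};
  for (std::size_t I = 0; I < Result.size(); ++I)
    Result[I] = (A[I] & Mask) | (B[I] & ~Mask);
  return Result;
}
```

Because the mask is either all ones or all zeros, exactly one of the two AND terms survives in every lane, so no per-element branch or scalarization is needed.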
* Introduce 'UseSSEx' to force SSE legacy encoding | Michael Liao | 2012-08-30 | 5 | -122/+158
    - Add 'UseSSEx' to prevent SSE legacy insns from being selected when AVX is enabled.
      Because inter-mixing SSE and AVX instructions carries a penalty, we need to prevent SSE legacy insns from being generated unless explicitly requested through some intrinsics. For patterns supported by both SSE and AVX, we have so far forced AVX insns to be tried first by relying on AddedComplexity or position in the td file. That is error-prone and introduces bugs accidentally.
      'UseSSEx' is disabled when AVX is turned on. For SSE insns inherited by AVX, we need this predicate to force VEX encoding or SSE legacy encoding only.
      For insns not inherited by AVX, we still use the previous predicates, i.e. 'HasSSEx'. So far, these insns fall into the following categories:
      * SSE insns with MMX operands
      * SSE insns with GPR/MEM operands only (xFENCE, PREFETCH, CLFLUSH, CRC, etc.)
      * SSE4A insns.
      * MMX insns.
      * x87 insns added by SSE.
    2 test cases are modified:
    - test/CodeGen/X86/fast-isel-x86-64.ll
      AVX code generation is different from the SSE one. 'vcvtsi2sdq' cannot be selected by fast-isel due to its complicated pattern, so fast-isel falls back to materializing it from the constant pool.
    - test/CodeGen/X86/widen_load-1.ll
      AVX code generation is different from the SSE one after fixing SSE/AVX inter-mixing. Exec-domain fixing prefers 'vmovapd' instead of 'vmovaps'.
    llvm-svn: 162919
* Apply "/Og-" also to MSC15(aka VS9) on VMCore/Function.cpp. | NAKAMURA Takumi | 2012-08-30 | 1 | -1/+1
    llvm-svn: 162917
* PPCISelLowering.cpp: Fix r162725. | NAKAMURA Takumi | 2012-08-30 | 1 | -1/+5
    [Tobias von Koch] What's happening here is that the CR6SET/CR6UNSET is breaking the chain of register copies glued to the function call (BL_SVR4 node). The scheduler then moves other instructions in between those and the function call, which isn't good!
    Right. That's the case where there is no chain of register copies before the call, so InFlag == 0...
    Attached is a new revision of the patch which should fix this for good.
    llvm-svn: 162916
* PPCISelLowering.cpp: Whitespace. | NAKAMURA Takumi | 2012-08-30 | 1 | -1/+1
    llvm-svn: 162915
* test | Michael Ilseman | 2012-08-30 | 1 | -2/+2
    llvm-svn: 162914
* LoopRotate: Also rotate loops with multiple exits. | Benjamin Kramer | 2012-08-30 | 1 | -13/+60
    The old PHI updating code in loop-rotate was replaced with SSAUpdater a while ago; it has no problems with complex PHIs. What had to be fixed is detecting whether a loop was already rotated and updating dominators when multiple exits were present.
    This change increases overall code size a bit, mostly due to additional loop unrolling opportunities.
    Passes test-suite and selfhost with -verify-dom-info. Fixes PR7447.
    Thanks to Andy for the input on the domtree updating code.
    llvm-svn: 162912
* InstCombine: Fix comment to reflect the code. | Benjamin Kramer | 2012-08-30 | 1 | -1/+1
    llvm-svn: 162911
* Don't use MCInstrDesc flags for implicit operands. | Jakob Stoklund Olesen | 2012-08-30 | 1 | -11/+16
    When a MachineInstr is constructed, its implicit operands are added first, then the explicit operands are inserted before the implicits.
    MCInstrDesc has operand flags like early clobber and operand ties that apply to the explicit operands.
    Don't look at those flags when the implicit operands are first added in the explicit operands' positions.
    llvm-svn: 162910
* Whitespace | Alexey Samsonov | 2012-08-30 | 1 | -3/+3
    llvm-svn: 162907
* It is illegal to transform (sdiv (ashr X c1) c2) -> (sdiv x (2^c1 * c2)), because C always rounds towards zero. | Nadav Rotem | 2012-08-30 | 1 | -10/+0
    Thanks Dirk and Ben.
    llvm-svn: 162899
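A small counterexample makes the rounding mismatch in the commit above concrete. This is an illustrative sketch with hypothetical helper names (`ashr`, `shiftThenDivide`, `foldedDivide`); the arithmetic shift is written as an explicit floor division so the result does not depend on how a compiler shifts negative values.

```cpp
#include <cassert>
#include <cstdint>

// Arithmetic shift right by C bits, modeled as division by 2^C that
// rounds toward negative infinity.
inline std::int32_t ashr(std::int32_t X, unsigned C) {
  assert(C < 31);
  const std::int32_t D = std::int32_t{1} << C;
  const std::int32_t Q = X / D;              // C++ division truncates toward zero
  return (X % D != 0 && X < 0) ? Q - 1 : Q;  // adjust negative inexact results down
}

// The original expression: sdiv (ashr X, C1), C2.
inline std::int32_t shiftThenDivide(std::int32_t X, unsigned C1, std::int32_t C2) {
  return ashr(X, C1) / C2;
}

// The rejected fold: sdiv X, (2^C1 * C2).
inline std::int32_t foldedDivide(std::int32_t X, unsigned C1, std::int32_t C2) {
  return X / ((std::int32_t{1} << C1) * C2);
}
```

With X = -3, c1 = 1, c2 = 2, the shift yields -2 and the division then yields -1, while the folded form computes -3 / 4, which truncates to 0. The two forms agree whenever the shift is exact (for example X = 8), which is why the mismatch only shows up for negative values with low bits set.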
* Add support for moving pure S-register to NEON pipeline if desired | Tim Northover | 2012-08-30 | 1 | -2/+71
    llvm-svn: 162898
* Refactor fetching file/line info from DWARFContext to simplify the code and allow better code reuse. | Alexey Samsonov | 2012-08-30 | 4 | -63/+127
    Make the code a bit more conforming to LLVM code style. No functionality change.
    llvm-svn: 162895
* Add FMA to switch statement in VectorLegalizer::LegalizeOp so that it can be expanded when it isn't legal. | Craig Topper | 2012-08-30 | 1 | -0/+1
    llvm-svn: 162894
* Add support for FMA to WidenVectorResult. | Craig Topper | 2012-08-30 | 2 | -0/+14
    llvm-svn: 162893
* Only perform DAG combine on FMAs of legal types. | Craig Topper | 2012-08-30 | 1 | -0/+4
    llvm-svn: 162892
* Pass by pointer and not std::string. | Bill Wendling | 2012-08-30 | 1 | -2/+2
    llvm-svn: 162888
* Revert r162855 in favor of changing clang to emit the absolute coverage file path. | Bill Wendling | 2012-08-30 | 1 | -19/+7
    llvm-svn: 162883
* Fix PR13727 | Michael Liao | 2012-08-30 | 1 | -3/+7
    - The root cause is that target constant materialization in X86 fast-isel creates PC-relative addressing, which may overflow the 32-bit range in a non-Small code model if the .rodata section is allocated too far away from the code segment in MCJIT, which uses the Large code model so far.
    - Follow the similar logic used to fix non-Small code models in fast-isel: skip non-Small code models.
    llvm-svn: 162881
* Verify the order of tied operands in inline asm. | Jakob Stoklund Olesen | 2012-08-29 | 1 | -0/+12
    When there are multiple tied use-def pairs on an inline asm instruction, the tied uses must appear in the same order as the defs.
    It is possible to write an LLVM IR inline asm instruction that breaks this constraint, but there is no reason for a front end to emit the operands out of order.
    The GNU inline asm syntax specifies tied operands as a single read/write constraint "+r", so out of order operands are not possible.
    llvm-svn: 162878
* Add some __builtin_expect magic to StringMap. | Benjamin Kramer | 2012-08-29 | 1 | -4/+5
    Tombstones and full hash collisions are rare, mark the "empty" and "no collision" paths as likely. The bug in simplifycfg that prevented the hints from being picked up during selfhost was fixed recently :)
    llvm-svn: 162874
* Replace the BUILTIN_EXPECT macro with a less horrible LLVM_LIKELY/LLVM_UNLIKELY interface. | Benjamin Kramer | 2012-08-29 | 1 | -6/+6
    llvm-svn: 162873
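The two StringMap commits above rely on `__builtin_expect` wrapped in likely/unlikely macros. The sketch below shows the general pattern only: the `SKETCH_LIKELY`/`SKETCH_UNLIKELY` macros, the `findBucket` function, and the bucket encoding (0 for empty, -1 for tombstone) are all hypothetical and are not taken from LLVM's sources.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrappers; fall back to the plain expression on compilers
// without __builtin_expect.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_LIKELY(EXPR) __builtin_expect(static_cast<bool>(EXPR), true)
#define SKETCH_UNLIKELY(EXPR) __builtin_expect(static_cast<bool>(EXPR), false)
#else
#define SKETCH_LIKELY(EXPR) (EXPR)
#define SKETCH_UNLIKELY(EXPR) (EXPR)
#endif

// Linear-probing lookup over positive int keys.
// Assumed encoding: 0 marks an empty bucket, -1 marks a tombstone.
// Returns the bucket holding Key, or the empty bucket where it would go,
// or Table.size() if the table is full and Key is absent.
inline std::size_t findBucket(const std::vector<int> &Table, int Key) {
  assert(!Table.empty() && Key > 0);
  std::size_t Idx = static_cast<std::size_t>(Key) % Table.size();
  for (std::size_t Probes = 0; Probes < Table.size(); ++Probes) {
    const int Slot = Table[Idx];
    if (SKETCH_LIKELY(Slot == 0))    // empty bucket: the common miss path
      return Idx;
    if (SKETCH_LIKELY(Slot == Key))  // no collision: the common hit path
      return Idx;
    Idx = (Idx + 1) % Table.size();  // tombstone or other key: rare, keep probing
  }
  return Table.size();
}
```

The hints do not change what the function returns; they only tell the optimizer which branch to lay out as the fall-through path.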
* Allow targets to specify a minimum supported NOP size when performing NOP padding. | Owen Anderson | 2012-08-29 | 1 | -0/+6
    If the desired padding is smaller than the supported NOP size, we will enlarge the padding to make it work.
    llvm-svn: 162870
* Set the isTied flags when building INLINEASM MachineInstrs. | Jakob Stoklund Olesen | 2012-08-29 | 1 | -4/+21
    For normal instructions, isTied() is set automatically by addOperand(), based on MCInstrDesc, but inline asm has tied operands outside the descriptor.
    llvm-svn: 162869
* Preserve branch profile metadata during switch formation. | Andrew Trick | 2012-08-29 | 1 | -0/+154
    Patch by Michael Ilseman!
    This fixes SimplifyCFGOpt::FoldValueComparisonIntoPredecessors to preserve metadata when folding conditional branches into switches.

    void foo(int x) {
      if (x == 0)
        bar(1);
      else if (__builtin_expect(x == 10, 1))
        bar(2);
      else if (x == 20)
        bar(3);
    }

    CFG:
      B0
      | \
      |  X0
      B10
      | \
      |  X10
      B20
      | \
      E  X20

    Merge B0-B10:
      w(B0-X0) = w(B0-X0) * sum-weights(B10) = w(B0-X0) * (w(B10-X10) + w(B10-B20))
      w(B0-X10) = w(B0-B10) * w(B10-X10)
      w(B0-B20) = w(B0-B10) * w(B10-B20)

      B0 __
      | \  \
      | X10 X0
      B20
      | \
      E  X20

    Merge B0-B20:
      w(B0-X0) = w(B0-X0) * sum-weights(B20) = w(B0-X0) * (w(B20-E) + w(B20-X20))
      w(B0-X10) = w(B0-X10) * sum-weights(B20) = ...
      w(B0-X20) = w(B0-B20) * w(B20-X20)
      w(B0-E) = w(B0-B20) * w(B20-E)
    llvm-svn: 162868
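The weight formulas in the commit message above can be checked with a small worked example. The sketch below is only an illustration of that arithmetic: `Weights` and `mergeBranchWeights` are hypothetical names, the sample weights are made up, and overflow handling or rescaling (which a real pass would need) is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

using Weights = std::vector<std::uint64_t>;

// Fold a successor block's edge weights into its predecessor's.
// Pred[EdgeToSucc] is the weight of the edge predecessor -> successor.
// Result order: the predecessor's remaining edges, then the successor's edges.
inline Weights mergeBranchWeights(const Weights &Pred, std::size_t EdgeToSucc,
                                  const Weights &Succ) {
  assert(EdgeToSucc < Pred.size() && !Succ.empty());
  const std::uint64_t SuccSum =
      std::accumulate(Succ.begin(), Succ.end(), std::uint64_t{0});
  Weights Merged;
  for (std::size_t I = 0; I < Pred.size(); ++I)
    if (I != EdgeToSucc)
      Merged.push_back(Pred[I] * SuccSum);   // e.g. w(B0-X0) * sum-weights(B10)
  for (std::uint64_t W : Succ)
    Merged.push_back(Pred[EdgeToSucc] * W);  // e.g. w(B0-B10) * w(B10-X10)
  return Merged;
}
```

Taking B0 = {X0: 1, B10: 3} and B10 = {X10: 9, B20: 1}, the first merge gives {X0: 10, X10: 27, B20: 3}. The total of 40 equals 4 * 10, the product of the two original totals, so every path keeps its relative probability. Merging B20 = {E: 1, X20: 1} into that result gives {X0: 20, X10: 54, E: 3, X20: 3}.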
* whitespace | Andrew Trick | 2012-08-29 | 1 | -168/+168
    llvm-svn: 162867
* Rename hasVolatileMemoryRef() to hasOrderedMemoryRef(). | Jakob Stoklund Olesen | 2012-08-29 | 3 | -11/+11
    Ordered memory operations are more constrained than volatile loads and stores because they must be ordered with respect to all other memory operations.
    llvm-svn: 162861
* Don't move normal loads across volatile/atomic loads. | Jakob Stoklund Olesen | 2012-08-29 | 1 | -3/+8
    It is technically allowed to move a normal load across a volatile load, but probably not a good idea.
    It is not allowed to move a load across an atomic load with Ordering > Monotonic, and we model those with MOVolatile as well.
    I recently removed the mayStore flag from atomic load instructions, so they don't need a pseudo-opcode. This patch makes up for the difference.
    llvm-svn: 162857
* Use the full path to output the .gcda file. | Bill Wendling | 2012-08-29 | 1 | -7/+19
    This lets the user run the program from a different directory and still have the .gcda files show up in the correct place.
    <rdar://problem/12179524>
    llvm-svn: 162855
* Reserve space for the mandatory traceback fields on PPC64. | Hal Finkel | 2012-08-29 | 1 | -4/+8
    We need to reserve space for the mandatory traceback fields, though leaving them as zero is appropriate for now.
    Although the ABI calls for these fields to be filled in fully, no compiler on Linux currently does this, and GDB does not read these fields. GDB uses the first word of zeroes during exception handling to find the end of the function and the size field, allowing it to compute the beginning of the function. DWARF information is used for everything else. We need the extra 8 bytes of pad so the size field is found in the right place.
    As a comparison, GCC fills in a few of the fields -- language, number of saved registers -- but ignores the rest. IBM's proprietary OSes do make use of the full traceback table facility.
    Patch by Bill Schmidt.
    llvm-svn: 162854
* Use ArrayRef instead of SmallVector when passing vector into function. | Bill Wendling | 2012-08-29 | 1 | -4/+3
    llvm-svn: 162851
* Verify the consistency of inline asm operands. | Jakob Stoklund Olesen | 2012-08-29 | 1 | -16/+72
    The operands on an INLINEASM machine instruction are divided into groups headed by immediate flag operands. Verify this structure.
    Extract verifyTiedOperands(), and only call it for non-inlineasm instructions.
    llvm-svn: 162849
* Clean this up slightly, doesn't really fall through. | Eric Christopher | 2012-08-29 | 1 | -2/+1
    llvm-svn: 162848
* Refactor setExecutionDomain to be clearer about what it's doing and more robust. | Tim Northover | 2012-08-29 | 1 | -45/+53
    llvm-svn: 162844
* Make helper function static. | Benjamin Kramer | 2012-08-29 | 1 | -1/+1
    llvm-svn: 162843
* Make MemoryBuiltins aware of TargetLibraryInfo. | Benjamin Kramer | 2012-08-29 | 25 | -155/+237
    This disables malloc-specific optimization when -fno-builtin (or -ffreestanding) is specified. This has been a problem for a long time but became more severe with the recent memory builtin improvements.
    Since the memory builtin functions are used everywhere, this required passing TLI in many places. This means that functions that now have an optional TLI argument, like RecursivelyDeleteTriviallyDeadFunctions, won't remove dead mallocs anymore if the TLI argument is missing. I've updated most passes to do the right thing.
    Fixes PR13694 and probably others.
    llvm-svn: 162841
* Convert FMA4 patterns to use target specific nodes instead of intrinsics to align with FMA3. | Craig Topper | 2012-08-29 | 3 | -38/+36
    llvm-svn: 162829
* Add virtual keywords for methods that override the base class. | Craig Topper | 2012-08-29 | 1 | -8/+8
    llvm-svn: 162826
* Cleanup sloppy code. Jakob's review. | Andrew Trick | 2012-08-29 | 1 | -4/+3
    llvm-svn: 162825
* [arm-fast-isel] Add support for ARM PIC. | Jush Lu | 2012-08-29 | 1 | -6/+16
    llvm-svn: 162823
* Fix ARM vector copies of overlapping register tuples. | Andrew Trick | 2012-08-29 | 1 | -0/+13
    I have tested the fix, but have not been successful in generating a robust unit test. This can only be exposed through particular register assignments.
    llvm-svn: 162821
* cleanup | Andrew Trick | 2012-08-29 | 1 | -21/+19
    llvm-svn: 162820
* Verify the tied operand flags. | Jakob Stoklund Olesen | 2012-08-29 | 1 | -0/+37
    When running with -verify-machineinstrs, check that tied operands come in matching use/def pairs, and that they are consistent with MCInstrDesc when it applies.
    llvm-svn: 162816
* Maintain a valid isTied bit as operands are added and removed. | Jakob Stoklund Olesen | 2012-08-29 | 1 | -1/+39
    The isTied bit is set automatically when a tied use is added and MCInstrDesc indicates a tied operand. The tie is broken when one of the tied operands is removed.
    llvm-svn: 162814
* Typo. | Chad Rosier | 2012-08-28 | 1 | -1/+1
    llvm-svn: 162807
* Add comments on the literal value used. | Michael Liao | 2012-08-28 | 1 | -1/+1
    llvm-svn: 162805
* Profile: set branch weight metadata with data generated from profiling. | Manman Ren | 2012-08-28 | 5 | -26/+377
    This patch implements ProfileDataLoader which loads profile data generated by -insert-edge-profiling and updates branch weight metadata accordingly.
    Patch by Alastair Murray.
    llvm-svn: 162799
* The instruction DEXT may be transformed into DEXTU or DEXTM depending on the size of the extraction and its position in the 64 bit word. | Jack Carter | 2012-08-28 | 4 | -3/+54
    This patch allows support of the dext transformations with mips64 direct object output.

    0 <= msb < 32    0 <= lsb < 32    0 <= pos < 32    1 <= size <= 32   DINS    The field is entirely contained in the right-most word of the doubleword
    32 <= msb < 64   0 <= lsb < 32    0 <= pos < 32    2 <= size <= 64   DINSM   The field straddles the words of the doubleword
    32 <= msb < 64   32 <= lsb < 64   32 <= pos < 64   1 <= size <= 32   DINSU   The field is entirely contained in the left-most word of the doubleword
    llvm-svn: 162782
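The range table in the commit above can be turned into a small classifier. This sketch follows the table exactly as quoted, which names the DINS variants even though the subject line talks about DEXT. It assumes lsb = pos and msb = pos + size - 1; `InsVariant` and `classifyInsert` are hypothetical names, not LLVM APIs.

```cpp
#include <cassert>

enum class InsVariant { DINS, DINSM, DINSU, Invalid };

// Pick the variant whose row of the table matches a field that starts at
// bit Pos and is Size bits wide, within a 64-bit doubleword.
inline InsVariant classifyInsert(unsigned Pos, unsigned Size) {
  if (Size == 0 || Pos >= 64 || Size > 64 - Pos)
    return InsVariant::Invalid;        // field does not fit in 64 bits
  const unsigned Msb = Pos + Size - 1; // assumed definition of msb
  if (Pos >= 32)
    return InsVariant::DINSU;          // field entirely in the left-most word
  if (Msb < 32)
    return InsVariant::DINS;           // field entirely in the right-most word
  return InsVariant::DINSM;            // field straddles both words
}
```

The size limits in the table follow from the position checks: a field with pos < 32 and msb >= 32 is at least 2 bits wide, and a field starting at pos >= 32 can be at most 32 bits wide.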
* Explicitly update the number of nodes to be traversed | Michael Liao | 2012-08-28 | 1 | -1/+1
    llvm-svn: 162780