path: root/llvm/lib
Commit message / Author / Age / Files / Lines
* [x86] lower calls to llvm.maxnum.v4f32 using maxps
  Author: Sanjay Patel | Age: 2015-12-26 | Files: 1 | Lines: -7/+10
  This is a follow-on to: http://reviews.llvm.org/rL255700
  llvm-svn: 256454
* [X86] Fix an unused variable warning in release builds.
  Author: Craig Topper | Age: 2015-12-26 | Files: 1 | Lines: -0/+2
  llvm-svn: 256453
* [X86] Add support for printing shuffle comments for AVX512 PSHUFB instructions.
  Author: Craig Topper | Age: 2015-12-26 | Files: 2 | Lines: -12/+39
  llvm-svn: 256452
* [X86] Fold some variable declarations and initializations into if statements. NFC
  Author: Craig Topper | Age: 2015-12-26 | Files: 1 | Lines: -6/+3
  llvm-svn: 256451
* [gc.statepoint] Change gc.statepoint intrinsic's return type to token type instead of i32 type
  Author: Chen Li | Age: 2015-12-26 | Files: 2 | Lines: -12/+6
  Summary: This patch changes the gc.statepoint intrinsic's return type to token type instead of i32 type. Using token types prevents LLVM from merging different gc.statepoint nodes into PHI nodes, which would cause further problems with gc relocations.
  The patch also changes how gc.relocate and gc.result look for their corresponding gc.statepoint on the unwind path. The current implementation uses the selector value extracted from a { i8*, i32 } landingpad as a hook to find the gc.statepoint, while the patch directly uses a token type landingpad (http://reviews.llvm.org/D15405) to find the gc.statepoint.
  Reviewers: sanjoy, JosephTremoulet, pgavlin, igor-laevsky, mjacob
  Subscribers: reames, mjacob, sanjoy, llvm-commits
  Differential Revision: http://reviews.llvm.org/D15662
  llvm-svn: 256443
* [X86] Fix shuffle decoding for variable VPERMIL to be tolerant of the Constant type not matching due to folding in the constant pool and to get VPERMILPD correct.
  Author: Craig Topper | Age: 2015-12-26 | Files: 3 | Lines: -34/+64
  llvm-svn: 256433
* [X86] Fix copy and paste typo from pasting from another Makefile to restore code.
  Author: Craig Topper | Age: 2015-12-25 | Files: 1 | Lines: -1/+1
  llvm-svn: 256431
* [X86] Put back the include path to the main X86 sources in the AsmParser library to fix the bots.
  Author: Craig Topper | Age: 2015-12-25 | Files: 1 | Lines: -0/+3
  llvm-svn: 256430
* [X86] Remove X86CodeGen dependency from the AsmParser library.
  Author: Craig Topper | Age: 2015-12-25 | Files: 2 | Lines: -4/+1
  llvm-svn: 256429
* [X86] Move getX86SubSuperRegisterOrZero to X86MCTargetDesc.cpp so it can be used by AsmParser library without depending on X86CodeGen library.
  Author: Craig Topper | Age: 2015-12-25 | Files: 5 | Lines: -193/+193
  llvm-svn: 256428
* Remove extra forward declarations and scrub includes for all in tree InstPrinters. NFC
  Author: Craig Topper | Age: 2015-12-25 | Files: 18 | Lines: -31/+4
  llvm-svn: 256427
* [X86] Move AVX512 STATIC_ROUNDING enum to X86BaseInfo.h to fix a layering violation in AsmParser.
  Author: Craig Topper | Age: 2015-12-25 | Files: 3 | Lines: -10/+10
  llvm-svn: 256426
* [X86] Replace MVT::SimpleValueType in the AsmParser library and getX86SubSuperRegister with just an unsigned representing size.
  Author: Craig Topper | Age: 2015-12-25 | Files: 7 | Lines: -118/+108
  This is a step towards fixing a layering violation so the X86 AsmParser won't depend on CodeGen types.
  llvm-svn: 256425
* [X86] Don't pass the default value to the High argument of getX86SubSuperRegister.
  Author: Craig Topper | Age: 2015-12-25 | Files: 2 | Lines: -8/+5
  Most places don't care about this argument. NFC
  llvm-svn: 256424
* [X86] getX86SubSuperRegisterOrZero shouldn't call getX86SubSuperRegister recursively.
  Author: Craig Topper | Age: 2015-12-25 | Files: 1 | Lines: -1/+1
  It should call itself instead. Otherwise it might fire an assertion when it was designed not to.
  llvm-svn: 256422
* [X86] Add missing X86II::MRM_C4, MRM_C5, etc. encodings to getMemoryOperandNo.
  Author: Craig Topper | Age: 2015-12-25 | Files: 1 | Lines: -15/+19
  These aren't used by any instructions, but could be someday. NFC
  llvm-svn: 256421
* [X86] Use assert instead of if and llvm_unreachable. NFC
  Author: Craig Topper | Age: 2015-12-25 | Files: 1 | Lines: -2/+1
  llvm-svn: 256420
* [X86] Minor indentation fixes. NFC
  Author: Craig Topper | Age: 2015-12-25 | Files: 1 | Lines: -2/+2
  llvm-svn: 256419
* [CodeGen] Use generic printAsOperand machinery instead of hand rolling it
  Author: David Majnemer | Age: 2015-12-25 | Files: 1 | Lines: -3/+7
  We already know how to properly print out basic blocks in printAsOperand, we should not roll it ourselves in AsmPrinter::EmitBasicBlockStart.
  No functionality change is intended.
  llvm-svn: 256413
* [IR] Mark the Type subclass helper methods 'inline' and move their definitions to DerivedTypes.h so they can be inlined by the compiler.
  Author: Craig Topper | Age: 2015-12-25 | Files: 1 | Lines: -49/+0
  llvm-svn: 256406
* [Transforms] Use asserts instead of ifs around llvm_unreachable. NFC
  Author: Craig Topper | Age: 2015-12-25 | Files: 1 | Lines: -34/+20
  llvm-svn: 256405
* [WebAssembly] Fix handling of COPY instructions in WebAssemblyRegStackify.
  Author: Dan Gohman | Age: 2015-12-25 | Files: 4 | Lines: -58/+64
  Move RegStackify after coalescing and teach it to use LiveIntervals instead of depending on SSA form. This avoids a problem where a register in a COPY instruction is stackified and then subsequently coalesced with a register that is not stackified.
  This also puts it after the scheduler, which allows us to simplify the EXPR_STACK constraint, as we no longer have instructions being reordered after stackification and before coloring.
  llvm-svn: 256402
* [InstCombine] transform more extract/insert pairs into shuffles (PR2109)
  Author: Sanjay Patel | Age: 2015-12-24 | Files: 1 | Lines: -3/+50
  This is an extension of the shuffle combining from r203229:
  http://reviews.llvm.org/rL203229

  The idea is to widen a short input vector with undef elements so the existing shuffle transform for extract/insert can kick in.

  The motivation is to finally solve PR2109:
  https://llvm.org/bugs/show_bug.cgi?id=2109

  For that example, the IR becomes:

      %1 = bitcast <2 x i32>* %P to <2 x float>*
      %ld1 = load <2 x float>, <2 x float>* %1, align 8
      %2 = shufflevector <2 x float> %ld1, <2 x float> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef>
      %i2 = shufflevector <4 x float> %A, <4 x float> %2, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
      ret <4 x float> %i2

  And x86 SSE output improves from:

      movq (%rdi), %xmm1 ## xmm1 = mem[0],zero
      movdqa %xmm1, %xmm2
      shufps $229, %xmm2, %xmm2 ## xmm2 = xmm2[1,1,2,3]
      shufps $48, %xmm0, %xmm1 ## xmm1 = xmm1[0,0],xmm0[3,0]
      shufps $132, %xmm1, %xmm0 ## xmm0 = xmm0[0,1],xmm1[0,2]
      shufps $32, %xmm0, %xmm2 ## xmm2 = xmm2[0,0],xmm0[2,0]
      shufps $36, %xmm2, %xmm0 ## xmm0 = xmm0[0,1],xmm2[2,0]
      retq

  To the almost optimal:

      movhpd (%rdi), %xmm0

  Note: There's a tension in the existing transform related to generating arbitrary shufflevector masks. We avoid that in other places in InstCombine because we're scared that codegen can't handle strange masks, but it looks like we're ok with producing those here. I purposely chose weird insert/extract indexes for the regression tests to see the effect in these cases. For PowerPC+Altivec, AArch64, and X86+SSE/AVX, I think the codegen is equal or better for these examples.

  Differential Revision: http://reviews.llvm.org/D15096
  llvm-svn: 256394
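The widening idea in the commit above can be modeled outside of LLVM. The following is a minimal sketch, not InstCombine code: `Lane`, `shuffleVector`, `viaExtractInsert`, and `viaWidenedShuffle` are hypothetical names, and the lane positions (short-vector elements 0 and 1 landing in lanes 2 and 3) are inferred from the `<0, 1, 4, 5>` mask quoted in the message. It only checks that widening with undef lanes followed by one shuffle gives the same lanes as two extract/insert pairs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A lane is either a defined float or undef (hypothetical model type).
struct Lane {
  bool Defined;
  float Value;
  bool operator==(const Lane &O) const {
    return Defined == O.Defined && (!Defined || Value == O.Value);
  }
};
using Vec = std::vector<Lane>;

const Lane UndefLane{false, 0.0f};
Lane lane(float V) { return Lane{true, V}; }

// Models shufflevector: mask indices address the concatenation of A and B;
// a negative index stands for an undef mask element.
Vec shuffleVector(const Vec &A, const Vec &B, const std::vector<int> &Mask) {
  Vec Result;
  Result.reserve(Mask.size());
  for (int Idx : Mask) {
    if (Idx < 0) {
      Result.push_back(UndefLane);
      continue;
    }
    std::size_t I = static_cast<std::size_t>(Idx);
    Result.push_back(I < A.size() ? A[I] : B[I - A.size()]);
  }
  return Result;
}

// Before: two extractelement/insertelement pairs copy Short[0..1] into A[2..3].
Vec viaExtractInsert(Vec A, const Vec &Short) {
  A[2] = Short[0];
  A[3] = Short[1];
  return A;
}

// After: widen Short to four lanes with undef, then do a single shuffle.
Vec viaWidenedShuffle(const Vec &A, const Vec &Short) {
  Vec AllUndef(Short.size(), UndefLane);
  Vec Wide = shuffleVector(Short, AllUndef, {0, 1, -1, -1});
  return shuffleVector(A, Wide, {0, 1, 4, 5});
}
```

Both helpers return the same four lanes for any defined inputs, which is the property the transform relies on; whether the backend then selects a single `movhpd` is a codegen matter this model does not cover.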
* Remove unused constants from TypeTableBuilder.cpp.
  Author: Dave Bartolomeo | Age: 2015-12-24 | Files: 1 | Lines: -4/+0
  llvm-svn: 256389
* Fix case of path name
  Author: Bill Seurer | Age: 2015-12-24 | Files: 1 | Lines: -1/+1
  llvm-svn: 256388
* Fix CodeView library name and non-CMake builds
  Author: Dave Bartolomeo | Age: 2015-12-24 | Files: 3 | Lines: -4/+21
  llvm-svn: 256387
* LLVM CodeView library
  Author: Dave Bartolomeo | Age: 2015-12-24 | Files: 12 | Lines: -2/+670
  Summary: This diff is the initial implementation of the LLVM CodeView library. There is much more work to be done, namely a CodeView dumper and tests. This patch should help others make progress on the LLVM->CodeView debug info emission while I continue with the implementation of the dumper and tests.

  This library implements support for emitting debug info in the CodeView format. This phase of the implementation only includes support for CodeView type records. Clients that need to emit type records will use a class derived from TypeTableBuilder. TypeTableBuilder provides member functions for writing each kind of type record; each of these functions eventually calls the writeRecord virtual function to emit the actual bits of the record. Derived classes override writeRecord to implement the folding of duplicate records and the actual emission to the appropriate destination. LLVMCodeView provides MemoryTypeTableBuilder, which creates the table in memory. In the future, other classes derived from TypeTableBuilder will write to other destinations, such as the type stream in a PDB.

  The rest of the types in LLVMCodeView define the actual CodeView type records and all of the supporting enums and other types used in the type records. The TypeIndex class is of particular interest, because it is used by clients as a handle to a type in the type table.

  The library provides a relatively low-level interface based on the actual on-disk format of CodeView. For example, type records refer to other type records by TypeIndex, rather than by an actual pointer to the referent record. This allows clients to emit type records one at a time, rather than having to keep the entire transitive closure of type records in memory until everything has been emitted. At some point, having a higher-level interface layered on top of this one may be useful for debuggers and other tools that want a more holistic view of the debug info. The lower-level interface should be sufficient for compilers and linkers to do the debug info manipulation that they need to do efficiently.

  Reviewers: rnk, majnemer
  Subscribers: silvas, rnk, jevinskie, llvm-commits
  Differential Revision: http://reviews.llvm.org/D14961
  llvm-svn: 256385
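The layering described in the commit above (per-record writer functions funnelling into one virtual `writeRecord`, with a derived class folding duplicates) might look roughly like the sketch below. This is not the LLVMCodeView API: the `...Sketch` classes, the string-based record encoding, the two writer functions, and the starting index value are all assumptions made for illustration; only the names TypeTableBuilder, MemoryTypeTableBuilder, writeRecord, and TypeIndex come from the message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Simplified stand-in for a handle to a record in the type table.
struct TypeIndex {
  std::uint32_t Value;
  bool operator==(const TypeIndex &Other) const { return Value == Other.Value; }
};

// Assumed first index handed out for emitted records (illustrative only).
const std::uint32_t FirstRecordIndex = 0x1000;

// Base class: one writer per record kind, all ending in writeRecord.
class TypeTableBuilderSketch {
public:
  virtual ~TypeTableBuilderSketch() = default;

  // Records refer to other records by TypeIndex, not by pointer.
  TypeIndex writePointer(TypeIndex Referent) {
    return writeRecord("PTR:" + std::to_string(Referent.Value));
  }
  TypeIndex writeModifier(TypeIndex Modified, bool IsConst) {
    return writeRecord("MOD:" + std::to_string(Modified.Value) +
                       (IsConst ? ":const" : ""));
  }

private:
  // Derived classes decide where the bytes go and how duplicates fold.
  virtual TypeIndex writeRecord(const std::string &Bytes) = 0;
};

// In-memory table that returns the existing index for a duplicate record.
class MemoryTypeTableBuilderSketch : public TypeTableBuilderSketch {
public:
  std::size_t size() const { return Records.size(); }

private:
  TypeIndex writeRecord(const std::string &Bytes) override {
    auto It = Seen.find(Bytes);
    if (It != Seen.end())
      return It->second; // duplicate: reuse the earlier index
    TypeIndex Index{FirstRecordIndex +
                    static_cast<std::uint32_t>(Records.size())};
    Records.push_back(Bytes);
    Seen.emplace(Bytes, Index);
    return Index;
  }

  std::vector<std::string> Records;
  std::map<std::string, TypeIndex> Seen;
};
```

Because each record mentions its referents only by index, a client can emit a pointer record as soon as the pointee's index is known and then forget the pointee, which is the memory argument the message makes for the low-level interface.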
* [X86][ms-inline asm] Add support for memory operands that include structs
  Author: Marina Yatsina | Age: 2015-12-24 | Files: 1 | Lines: -1/+5
  Add ability to reference struct symbols in memory operands. Test case will be added on the clang side (review http://reviews.llvm.org/D15749)
  Differential Revision: http://reviews.llvm.org/D15748
  llvm-svn: 256381
* [ProfileData] Make helper function static.
  Author: Benjamin Kramer | Age: 2015-12-24 | Files: 1 | Lines: -1/+1
  No functional change.
  llvm-svn: 256375
* [FunctionImport] Move pass into anonymous namespace.
  Author: Benjamin Kramer | Age: 2015-12-24 | Files: 1 | Lines: -0/+2
  No functional change.
  llvm-svn: 256374
* Add a missing const qualifier on the context instruction. This somehow has always been missing. =/
  Author: Chandler Carruth | Age: 2015-12-24 | Files: 1 | Lines: -1/+1
  llvm-svn: 256371
* [X86][PKU] Add {RD,WR}PKRU encoding
  Author: Asaf Badouh | Age: 2015-12-24 | Files: 2 | Lines: -6/+12
  Differential Revision: http://reviews.llvm.org/D15711
  llvm-svn: 256366
* AVX-512: Kreg set 0/1 optimization
  Author: Elena Demikhovsky | Age: 2015-12-24 | Files: 1 | Lines: -6/+28
  The patterns that set a mask register to 0/1

      KXOR %kn, %kn, %kn / KXNOR %kn, %kn, %kn

  are replaced with

      KXOR %k0, %k0, %kn / KXNOR %k0, %k0, %kn

  - AVX-512 targets optimization.

  KNL does not recognize dependency-breaking idioms for mask registers, so kxnor %k1, %k1, %k2 has a RAW dependence on %k1. Using %k0 as the undef input register is a performance heuristic based on the assumption that %k0 is used less frequently than the other mask registers, since it is not usable as a write mask.

  Differential Revision: http://reviews.llvm.org/D15739
  llvm-svn: 256365
* AVX512: VPMOVM2B/W/D/Q intrinsic implementation.
  Author: Igor Breger | Age: 2015-12-24 | Files: 2 | Lines: -14/+32
  Differential Revision: http://reviews.llvm.org//D15747
  llvm-svn: 256364
* Use range-based for loops. NFC
  Author: Craig Topper | Age: 2015-12-24 | Files: 1 | Lines: -40/+32
  llvm-svn: 256363
* AMDGPU: Fix getRegisterBitWidth for vectors
  Author: Matt Arsenault | Age: 2015-12-24 | Files: 1 | Lines: -1/+3
  llvm-svn: 256362
* Revert r256336, it caused PR25939
  Author: Nico Weber | Age: 2015-12-24 | Files: 1 | Lines: -113/+61
  llvm-svn: 256361
* AMDGPU/SI: Fix encoding of flat instructions on VI
  Author: Tom Stellard | Age: 2015-12-24 | Files: 3 | Lines: -93/+236
  Reviewers: arsenm
  Subscribers: arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D15735
  llvm-svn: 256360
* AMDGPU/SI: Remove non-existent flat instructions
  Author: Tom Stellard | Age: 2015-12-24 | Files: 1 | Lines: -2/+0
  Reviewers: arsenm
  Subscribers: arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D15734
  llvm-svn: 256357
* [Statepoints] Use Indirect operands for spill slots
  Author: Philip Reames | Age: 2015-12-23 | Files: 2 | Lines: -5/+35
  Teach the statepoint lowering code to emit Indirect stackmap entries for spills inserted by StatepointLowering (i.e. SelectionDAG), but Direct stackmap entries for in-IR allocas which represent manual stack slots. This is what the docs call for (http://llvm.org/docs/StackMaps.html#stack-map-format), but we've been emitting both as Direct. This was pointed out recently on the mailing list as a bug. It also blocks http://reviews.llvm.org/D15632 which extends the lowering to handle vector-of-pointers since only Indirect references can encode a variable sized slot.

  To implement this, I introduced a new flag on the StackObject class used to maintain information about stack slots. I originally considered (and prototyped in http://reviews.llvm.org/D15632) the idea of using the existing isSpillSlot flag, but ended up deciding that was a bit too risky and that the cost of adding a new flag was low. Having the new flag will also allow us - in the future - to emit better comments in verbose assembly which indicate where a particular stack spill around a call comes from (deopt, gc, regalloc).

  Differential Revision: http://reviews.llvm.org/D15759
  llvm-svn: 256352
* [MemOperands] Clarify code around dropping memory operands [NFC]
  Author: Philip Reames | Age: 2015-12-23 | Files: 1 | Lines: -1/+1
  Clarify a comment about what it means to drop memory operands from an instruction. While I'm at it, change the name of the method slightly to make it a bit clearer what's going on when reading calling code.
  llvm-svn: 256346
* [Function] Properly remove use when clearing personality
  Author: Keno Fischer | Age: 2015-12-23 | Files: 1 | Lines: -9/+10
  Summary: We need to actually remove the use of the personality function, otherwise we can run into trouble if we want to e.g. delete the personality function because there's no way to get rid of its uses. Do this by resetting to the ConstantPointerNull value that the operands are set to when first allocated.
  Reviewers: vsk, dexonsmith
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D15752
  llvm-svn: 256345
* Fix SCEV r256338.
  Author: JF Bastien | Age: 2015-12-23 | Files: 1 | Lines: -2/+2
  llvm-svn: 256344
* [SCEV] Fix getLoopBackedgeTakenCounts
  Author: Sanjoy Das | Age: 2015-12-23 | Files: 1 | Lines: -17/+16
  The way `getLoopBackedgeTakenCounts` is written right now isn't correct. It will try to compute and store the BE counts of a Loop #{child loop} number of times (which may be zero).
  llvm-svn: 256338
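The failure mode described above, doing a loop's work once per child rather than once per loop, can be reproduced with a toy loop nest. The sketch below is a reconstruction of the shape of the bug from the commit message alone, not the ScalarEvolution code; `Loop`, `visitPerChild`, and `visitOnce` are hypothetical names, and incrementing a counter stands in for computing and storing a backedge-taken count.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Toy loop-nest node (hypothetical; LLVM's Loop class is far richer).
struct Loop {
  std::vector<Loop *> SubLoops;
};

using VisitCounts = std::map<const Loop *, int>;

// Buggy shape: the per-loop work sits inside the iteration over children,
// so a loop is processed once per child, and a leaf loop is never processed.
VisitCounts visitPerChild(const Loop *Root) {
  VisitCounts Counts;
  std::vector<const Loop *> Worklist{Root};
  while (!Worklist.empty()) {
    const Loop *L = Worklist.back();
    Worklist.pop_back();
    for (const Loop *Sub : L->SubLoops) {
      ++Counts[L]; // stands in for "compute and store the BE count of L"
      Worklist.push_back(Sub);
    }
  }
  return Counts;
}

// Fixed shape: process each loop exactly once, then enqueue its children.
VisitCounts visitOnce(const Loop *Root) {
  VisitCounts Counts;
  std::vector<const Loop *> Worklist{Root};
  while (!Worklist.empty()) {
    const Loop *L = Worklist.back();
    Worklist.pop_back();
    ++Counts[L];
    for (const Loop *Sub : L->SubLoops)
      Worklist.push_back(Sub);
  }
  return Counts;
}
```

With an outer loop containing two leaf loops, the first traversal records the outer loop twice and the leaves not at all, while the second records every loop once.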
* [LIR] General refactoring to simplify code and ease future code review.
  Author: Chad Rosier | Age: 2015-12-23 | Files: 1 | Lines: -61/+113
  Move several checks into isLegalStores. Also, delineate between those stores that are memset-able and those that are memcpy-able.
  http://reviews.llvm.org/D15683
  Patch by Haicheng Wu <haicheng@codeaurora.org>!
  llvm-svn: 256336
* [MachineLICM] Fix handling of memoperands
  Author: Philip Reames | Age: 2015-12-23 | Files: 1 | Lines: -2/+12
  As far as I can tell, the correct interpretation of an empty memoperands list is that we didn't have sufficient room to store information about the MachineInstr, NOT that the MachineInstr doesn't access any particular bit of memory. This appears to be fairly consistent in a number of places, but I'm not 100% sure of this interpretation. I'd really appreciate someone more knowledgeable confirming my reading of the code.

  This patch fixes two latent bugs in MachineLICM - given the above assumption - and adds comments to document the meaning and required handling. I don't have test cases; these were noticed by inspection.

  Differential Revision: http://reviews.llvm.org/D15730
  llvm-svn: 256335
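The interpretation argued for above (an empty memoperands list means "unknown", not "touches no memory") matters because a check written as "all memory operands satisfy P" is vacuously true on an empty list. The sketch below illustrates that trap with hypothetical stand-in types; `InstrSketch`, `MemOperandSketch`, and `isKnownInvariantLoad` are not LLVM names, and the message does not say which two checks in MachineLICM were affected.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for a machine instruction and its memory operands.
struct MemOperandSketch {
  bool IsInvariant;
};

struct InstrSketch {
  bool MayLoad;
  std::vector<MemOperandSketch> MemOperands;
};

// Returns true only when the instruction is a load and every memory access
// it performs is known to be invariant.
bool isKnownInvariantLoad(const InstrSketch &MI) {
  if (!MI.MayLoad)
    return false;
  // An empty list means the information is missing, so nothing can be
  // proven. Without this check, std::all_of below would return true.
  if (MI.MemOperands.empty())
    return false;
  return std::all_of(
      MI.MemOperands.begin(), MI.MemOperands.end(),
      [](const MemOperandSketch &MO) { return MO.IsInvariant; });
}
```

The early return is the whole point of the sketch: the conservative answer for missing information has to be spelled out, because the generic "for all" formulation gives the optimistic one.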
* [X86][AVX] Only shuffle the lower half of vectors if the upper half is undefined
  Author: Simon Pilgrim | Age: 2015-12-23 | Files: 1 | Lines: -52/+110
  First step towards making better use of AVX's implicit zeroing of the upper half of a 256-bit vector by instructions that only act on the lower 128-bit vector - discussed on D14151.

  As well as the fact that 128-bit shuffle instructions are generally more capable, this can be performant for older CPUs with 128-bit ALUs (e.g. Jaguar, Sandy Bridge) that must treat 256-bit vectors as multiple micro-ops.

  Moved the similar subvector extraction shuffle combines from PerformShuffleCombine256 to lowerVectorShuffle as well.

  Note: I've avoided combining shuffles that reference elements from the upper halves of the input vectors - this may be reviewed in future work as well (AVX1 would probably always gain, but AVX2 does have some cross-lane shuffle instructions).

  Differential Revision: http://reviews.llvm.org/D15477
  llvm-svn: 256332
* [OperandBundles] Have GlobalsModRef play nice with operand bundles
  Author: David Majnemer | Age: 2015-12-23 | Files: 2 | Lines: -7/+6
  A call site's use of a Value might not correspond to an argument operand but to a bundle operand.
  llvm-svn: 256329
* [OperandBundles] Have TailCallElim play nice with operand bundles
  Author: David Majnemer | Age: 2015-12-23 | Files: 1 | Lines: -2/+2
  A call site's use of a Value might not correspond to an argument operand but to a bundle operand.
  This fixes PR25928.
  llvm-svn: 256328
* [OperandBundles] Have InstCombine play nice with operand bundles
  Author: David Majnemer | Age: 2015-12-23 | Files: 1 | Lines: -4/+6
  Don't assume a call's use corresponds to an argument operand, it might correspond to a bundle operand.
  llvm-svn: 256327
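The three operand-bundle commits above share one rule: before turning a use of a value at a call site into an argument number, check that the use really is an argument operand. The sketch below models that check with a hypothetical `CallSiteSketch`; the operand layout it assumes (arguments first, bundle operands after them) and the function names are illustrative assumptions, not the LLVM CallSite interface.

```cpp
#include <cassert>

// Hypothetical call-site model. Assumed operand layout for illustration:
// [0, NumArgs) are arguments, followed by NumBundleOperands bundle operands.
struct CallSiteSketch {
  unsigned NumArgs;
  unsigned NumBundleOperands;

  bool isArgOperand(unsigned OperandNo) const { return OperandNo < NumArgs; }
  bool isBundleOperand(unsigned OperandNo) const {
    return OperandNo >= NumArgs && OperandNo < NumArgs + NumBundleOperands;
  }
};

// Returns the argument number for a use, or -1 when the use is not an
// argument operand (for example, when it is a bundle operand). Callers that
// reason about parameter attributes must handle -1 conservatively.
int argumentNumberForUse(const CallSiteSketch &CS, unsigned OperandNo) {
  if (!CS.isArgOperand(OperandNo))
    return -1;
  return static_cast<int>(OperandNo);
}
```

A pass that skips the check would treat operand 2 of a two-argument call as "argument 2" and look up attributes for a parameter that does not exist.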