summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* x86: Rename NumBytesForCalleeToPush to ...Pop for accuracyReid Kleckner2014-01-311-5/+5
| | | | | | | If we have a callee cleanup convention, the callee is going to pop the arguments off the stack, not push them on. llvm-svn: 200566
* [ms-cxxabi] Add a new calling convention that swaps 'this' and 'sret'Reid Kleckner2014-01-316-1/+42
| | | | | | | | | | | | | | | | | | | | MSVC always places the 'this' parameter for a method first. The implicit 'sret' pointer for methods always comes second. We already implement this for __thiscall by putting sret parameters on the stack, but __cdecl methods require putting both parameters on the stack in opposite order. Using a special calling convention allows frontends to keep the sret parameter first, which avoids breaking lots of assumptions in LLVM and Clang. Fixes PR15768 with the corresponding change in Clang. Reviewers: ributzka, majnemer Differential Revision: http://llvm-reviews.chandlerc.com/D2663 llvm-svn: 200561
* [mips][msa] Add insert.d instruction.Matheus Almeida2014-01-312-0/+19
| | | | | | This instruction is only available on Mips64 cores that implement the MSA ASE. llvm-svn: 200543
* [vectorizer] Tweak the way we do small loop runtime unrolling in theChandler Carruth2014-01-311-15/+22
| | | | | | | | | | | | | | loop vectorizer to not do so when runtime pointer checks are needed and share code with the new (not yet enabled) load/store saturation runtime unrolling. Also ensure that we only consider the runtime checks when the loop hasn't already been vectorized. If it has, the runtime check cost has already been paid. I've fleshed out a test case to cover the scalar unrolling as well as the vector unrolling and comment clearly why we are or aren't following the pattern. llvm-svn: 200530
* Separate x86 opcode maps and 0x66/0xf2/0xf3 prefixes from each other in the ↵Craig Topper2014-01-314-288/+215
| | | | | | TSFlags. This greatly simplifies the switch statements in the disassembler tables and the code emitters. llvm-svn: 200522
* Move REP out of the Prefix field of the X86 format. Give it its own bit. It ↵Craig Topper2014-01-314-35/+35
| | | | | | had special handling anyway and this enables a future patch. llvm-svn: 200520
* Move address override handling in X86CodeEmitter to a place where it works ↵Craig Topper2014-01-311-28/+28
| | | | | | for VEX encoded instructions too. This allows 32-bit addressing to work in 64-bit mode. llvm-svn: 200517
* Move address override handling in X86MCCodeEmitter to a place where it works ↵Craig Topper2014-01-311-46/+43
| | | | | | for VEX encoded instructions too. This allows 32-bit addressing to work in 64-bit mode. llvm-svn: 200516
* Fix a bug in gcov instrumentation introduced by r195513. <rdar://15930350>Bob Wilson2014-01-311-1/+8
| | | | | | | | | | The entry block of a function starts with all the static allocas. The change in r195513 splits the block before those allocas, which has the effect of turning them into dynamic allocas. That breaks all sorts of things. Change to split after the initial allocas, and also add a comment explaining why the block is split. llvm-svn: 200515
* [Sparc] Save and restore float registers that may be used for parameter passing.Venkatraman Govindaraju2014-01-311-2/+44
| | | | llvm-svn: 200509
* This patch teaches the DAGCombiner how to fold insert_subvector nodesManman Ren2014-01-311-0/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | when the input is a concat_vectors and the insert replaces one of the concat halves: Lower half: fold (insert_subvector (concat_vectors X, Y), Z) -> (concat_vectors Z, Y) Upper half: fold (insert_subvector (concat_vectors X, Y), Z) -> (concat_vectors X, Z) This can be seen with the following IR: define <8 x float> @lower_half(<4 x float> %v1, <4 x float> %v2, <4 x float> %v3) { %1 = shufflevector <4 x float> %v1, <4 x float> %v2, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7> %2 = tail call <8 x float> @llvm.x86.avx.vinsertf128.ps.256(<8 x float> %1, <4 x float> %v3, i8 0) The vinsertf128 intrinsic is converted into an insert_subvector node in SelectionDAGBuilder.cpp. Using AVX, without the patch this generates two vinsertf128 instructions: vinsertf128 $1, %xmm1, %ymm0, %ymm0 vinsertf128 $0, %xmm2, %ymm0, %ymm0 With the patch this is optimized into: vinsertf128 $1, %xmm1, %ymm2, %ymm0 Patch by Robert Lougher. llvm-svn: 200506
* DAGCombine should not produce ISD::OR nodes after operation legalization if ↵Owen Anderson2014-01-311-2/+4
| | | | | | they're not legal. llvm-svn: 200503
* PGO branch weight: update edge weights in SelectionDAGBuilder.Manman Ren2014-01-312-12/+65
| | | | | | | | | | | | | | | | When converting from "or + br" to two branches, or converting from "and + br" to two branches, we correctly update the edge weights of the two branches. The previous attempt at r200431 was reverted at r200434 because of two testing case failures. I modified my patch a little, but forgot to re-run "make check-all". Testing case CodeGen/ARM/lsr-unfolded-offset.ll is updated because of the patch's impact on branch probability which causes changes in spill placement. llvm-svn: 200502
* Allow speculating llvm.sqrt, fma and fmuladdMatt Arsenault2014-01-311-0/+6
| | | | | | | | This doesn't set errno, so this should be OK. Also update the documentation to explicitly state that errno are not set. llvm-svn: 200501
* [x86] Fix signed relocations for i64i32imm operandsDavid Woodhouse2014-01-305-55/+84
| | | | | | | | | These should end up (in ELF) as R_X86_64_32S relocs, not R_X86_64_32. Kill the horrid and incomplete special case and FIXME in EncodeInstruction() and set things up so it can infer the signedness from the ImmType just like it can the size and whether it's PC-relative. llvm-svn: 200495
* [AArch64] Custom lower concat_vector patterns with v4i16, v4i32, v8i8, ↵Chad Rosier2014-01-302-0/+67
| | | | | | v8i16, v16i8 types. llvm-svn: 200491
* Fix PR18381 - print a minimal diagnostic rather than assert on unresolved ↵Timur Iskhodzhanov2014-01-301-0/+5
| | | | | | .secidx target llvm-svn: 200490
* Only ELF has a dynamic symbol table. Remove it from ObjectFile.Rafael Espindola2014-01-302-20/+0
| | | | | | | | | COFF has only one symbol table. MachO has a LC_DYSYMTAB, but that is not a symbol table, just extra info about the one symbol table (LC_SYMTAB). IR (coming soon) also has only one table. llvm-svn: 200488
* [Stackmaps] Record the stack size of each function that contains a ↵Juergen Ributzka2014-01-301-0/+24
| | | | | | | | | | stackmap/patchpoint intrinsic. Re-applying the patch, but this time without using AsmPrinter methods. Reviewed by Andy llvm-svn: 200481
* Reenable ARM EHABI on Android.Evgeniy Stepanov2014-01-301-1/+2
| | | | | | Broken in r200388. llvm-svn: 200466
* [mips] Fix typo.Matheus Almeida2014-01-301-1/+1
| | | | llvm-svn: 200465
* Remove duplicate patternsCraig Topper2014-01-301-5/+0
| | | | llvm-svn: 200461
* Remove some AddedComplexity tags that were forcing priority for AVX over ↵Craig Topper2014-01-301-36/+36
| | | | | | SSE. Use predicates instead. llvm-svn: 200458
* Remove duplicate pattern and add predicate checks on other patterns.Craig Topper2014-01-301-1/+2
| | | | llvm-svn: 200455
* Implement SPARCv9 atomic_swap_64 with a pseudo.Jakob Stoklund Olesen2014-01-302-3/+15
| | | | | | | | The SWAP instruction only exists in a 32-bit variant, but the 64-bit atomic swap can be implemented in terms of CASX, like the other atomic rmw primitives. llvm-svn: 200453
* ARM IAS: support .object_archSaleem Abdulrasool2014-01-302-3/+59
| | | | | | | | | | The .object_arch directive indicates an alternative architecture to be specified in the object file. The directive does *not* effect the enabled feature bits for the object file generation. This is particularly useful when the code performs runtime detection and would like to indicate a lower architecture as the requirements than the actual instructions used. llvm-svn: 200451
* ARM IAS: support .movspSaleem Abdulrasool2014-01-303-6/+102
| | | | | | | | .movsp is an ARM unwinding directive that indicates to the unwinder that a register contains an offset from the current stack pointer. If the offset is unspecified, it defaults to zero. llvm-svn: 200449
* ARM: suuport .tlsdescseq directiveSaleem Abdulrasool2014-01-304-0/+50
| | | | | | | | | | | This enhances the ARMAsmParser to handle .tlsdescseq directives. This is a slightly special relocation. We must be able to generate them, but not consume them in assembly. The relocation is meant to assist the linker in generating a TLS descriptor sequence. The ELF target streamer is enhanced to append additional fixups into the current segment and that is used to emit the new R_ARM_TLS_DESCSEQ relocations. llvm-svn: 200448
* ARM: support TLS descriptor relocationsSaleem Abdulrasool2014-01-302-0/+6
| | | | | | | | Add support for tlsdesc relocations which are part of the ABI, marked as experimental. These relocations permit the linker to perform TLS reference optimizations. llvm-svn: 200447
* ARM: support tlscall relocationsSaleem Abdulrasool2014-01-303-1/+23
| | | | | | | | | | | | | | This adds support for TLS CALL relocations. TLS CALL relocations are used to indicate to the linker to generate appropriate entries to resolve TLS references via an appropriate function invocation (e.g. __tls_get_addr(PLT)). In order to accomodate the linker relaxation of the TLS access model for the references (GD/LD -> IE, IE -> LE), the relocation addend must be incomplete. This requires that the partial inplace value is also incomplete (i.e. 0). We simply avoid the offset value calculation at the time of the fixup adjustment in the ARM assembler backend. llvm-svn: 200446
* Revert "[Stackmaps] Record the stack size of each function that contains a ↵Juergen Ributzka2014-01-301-24/+0
| | | | | | | | stackmap/patchpoint intrinsic." This reverts commit r200444 to unbreak buildbots. llvm-svn: 200445
* [Stackmaps] Record the stack size of each function that contains a ↵Juergen Ributzka2014-01-301-0/+24
| | | | | | | | stackmap/patchpoint intrinsic. Reviewed by Andy llvm-svn: 200444
* Simplify the handling of iterators in ObjectFile.Rafael Espindola2014-01-309-142/+58
| | | | | | | | | | | | None of the object file formats reported error on iterator increment. In retrospect, that is not too surprising: no object format stores symbols or sections in a linked list or other structure that requires chasing pointers. As a consequence, all error checking can be done on begin() and end(). This reduces the text segment of bin/llvm-readobj in my machine from 521233 to 518526 bytes. llvm-svn: 200442
* Reland r200340 - 'Add line table debug info to COFF files when using a win32 ↵Timur Iskhodzhanov2014-01-307-28/+514
| | | | | | | | triple' This incorporates a couple of fixes reviewed at http://llvm-reviews.chandlerc.com/D2651 llvm-svn: 200440
* Revert r200431 due to bot failures.Manman Ren2014-01-302-65/+12
| | | | llvm-svn: 200434
* PGO branch weight: update edge weights in SelectionDAGBuilder.Manman Ren2014-01-302-12/+65
| | | | | | | | When converting from "or + br" to two branches, or converting from "and + br" to two branches, we correctly update the edge weights of the two branches. llvm-svn: 200431
* PGO branch weight: update edge weights in IfConverter.Manman Ren2014-01-292-0/+54
| | | | | | | | | | | | | This commit only handles IfConvertTriangle. To update edge weights of a successor, one interface is added to MachineBasicBlock: /// Set successor weight of a given iterator. setSuccWeight(succ_iterator I, uint32_t weight) An existing testing case test/CodeGen/Thumb2/v8_IT_5.ll is updated, since we now correctly update the edge weights, the cold block is placed at the end of the function and we jump to the cold block. llvm-svn: 200428
* Move range handling for a function to endFunction rather thanEric Christopher2014-01-291-5/+5
| | | | | | when we create the subprogram DIE. llvm-svn: 200426
* If we use DW_AT_ranges we need to specify a base address that rangesEric Christopher2014-01-291-2/+8
| | | | | | | | are relative to in the compile unit. Currently let's just use 0... Thanks to Greg Clayton for the catch! llvm-svn: 200425
* Turn on CU ranges if we've got multiple compile units in the sameEric Christopher2014-01-291-4/+6
| | | | | | | | module since there's no range guarantee that we could make given output order. This also fixes up the testcases that have multiple CUs to have the correct range offset. llvm-svn: 200422
* Make the compile unit map a MapVector so that we can assume a stableEric Christopher2014-01-292-3/+5
| | | | | | output ordering. llvm-svn: 200421
* Fix formatting of comment.Eric Christopher2014-01-291-4/+2
| | | | llvm-svn: 200420
* MC: Better management of macro argumentsDavid Majnemer2014-01-291-55/+22
| | | | | | | | | | | | The linux kernel makes uses of a GAS `feature' which substitutes nothing for macro arguments which aren't specified. Proper support for these kind of macro arguments necessitated a cleanup of differences between `GAS' and `Darwin' dialect macro processing. Differential Revision: http://llvm-reviews.chandlerc.com/D2634 llvm-svn: 200409
* [CommandLine] Aliases require an value if their target requires a value.Jordan Rose2014-01-291-0/+7
| | | | | | | | | This can still be overridden by explicitly setting a value requirement on the alias option, but by default it should be the same. PR18649 llvm-svn: 200407
* Add support for PC-relative non-extern relocations to RuntimeDyldMachO.Lang Hames2014-01-291-0/+2
| | | | | | | | | Also replaces testcase for r180790 (support for absolute non-externs relocs) with a more robust version. <rdar://problem/15864721> llvm-svn: 200404
* [X86][SchedModel] Fix typos in the definitions of the ports for Haswell.Quentin Colombet2014-01-291-6/+8
| | | | llvm-svn: 200403
* Test commitOliver Stannard2014-01-291-0/+1
| | | | llvm-svn: 200401
* [mips][msa] Add fill.d instruction.Matheus Almeida2014-01-292-1/+16
| | | | | | | This instruction is only available on Mips64 cores that implement the MSA ASE. llvm-svn: 200400
* [mips][msa] Add copy_{u,s}.d.Matheus Almeida2014-01-293-14/+52
| | | | | | | These instructions are only available on Mips64 cores that implement the MSA ASE. llvm-svn: 200398
* [LPM] Fix PR18643, another scary place where loop transforms failed toChandler Carruth2014-01-292-49/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | preserve loop simplify of enclosing loops. The problem here starts with LoopRotation which ends up cloning code out of the latch into the new preheader it is buidling. This can create a new edge from the preheader into the exit block of the loop which breaks LoopSimplify form. The code tries to fix this by splitting the critical edge between the latch and the exit block to get a new exit block that only the latch dominates. This sadly isn't sufficient. The exit block may be an exit block for multiple nested loops. When we clone an edge from the latch of the inner loop to the new preheader being built in the outer loop, we create an exiting edge from the outer loop to this exit block. Despite breaking the LoopSimplify form for the inner loop, this is fine for the outer loop. However, when we split the edge from the inner loop to the exit block, we create a new block which is in neither the inner nor outer loop as the new exit block. This is a predecessor to the old exit block, and so the split itself takes the outer loop out of LoopSimplify form. We need to split every edge entering the exit block from inside a loop nested more deeply than the exit block in order to preserve all of the loop simplify constraints. Once we try to do that, a problem with splitting critical edges surfaces. Previously, we tried a very brute force to update LoopSimplify form by re-computing it for all exit blocks. We don't need to do this, and doing this much will sometimes but not always overlap with the LoopRotate bug fix. Instead, the code needs to specifically handle the cases which can start to violate LoopSimplify -- they aren't that common. We need to see if the destination of the split edge was a loop exit block in simplified form for the loop of the source of the edge. For this to be true, all the predecessors need to be in the exact same loop as the source of the edge being split. If the dest block was originally in this form, we have to split all of the deges back into this loop to recover it. The old mechanism of doing this was conservatively correct because at least *one* of the exiting blocks it rewrote was the DestBB and so the DestBB's predecessors were fixed. But this is a much more targeted way of doing it. Making it targeted is important, because ballooning the set of edges touched prevents LoopRotate from being able to split edges *it* needs to split to preserve loop simplify in a coherent way -- the critical edge splitting would sometimes find the other edges in need of splitting but not others. Many, *many* thanks for help from Nick reducing these test cases mightily. And helping lots with the analysis here as this one was quite tricky to track down. llvm-svn: 200393
OpenPOWER on IntegriCloud