path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
* When a block ends in an indirect branch, add its successors to the machine basic block.
  Bill Wendling, 2012-10-22 (1 file changed, -0/+5)
  The CFG of the machine function needs to know that the targets of the indirect branch are successors to the indirect branch. <rdar://problem/12529625>
  llvm-svn: 166448
* Add support for annotated disassembly output for X86 and ARM.
  Kevin Enderby, 2012-10-22 (2 files changed, -130/+516)
  Per the October 12, 2012 proposal for annotated disassembly output sent out by Jim Grosbach, this set of changes implements it for X86 and ARM. The llvm-mc tool now has an -mdis option to produce the marked-up disassembly, and a couple of small example test cases have been added. rdar://11764962
  llvm-svn: 166445
* [ms-inline asm] Add the isOffsetOf() function.
  Chad Rosier, 2012-10-22 (1 file changed, -0/+5)
  Part of rdar://12470317
  llvm-svn: 166436
* [ms-inline asm] Add support for parsing the offset operator.
  Chad Rosier, 2012-10-22 (1 file changed, -5/+21)
  Callback for CodeGen in the front-end not implemented yet. rdar://12470317
  llvm-svn: 166433
* [ms-inline asm] Reset the opcode prior to parsing a statement.
  Chad Rosier, 2012-10-19 (1 file changed, -3/+0)
  llvm-svn: 166349
* [mips] Use 64-bit registers to return an sret pointer if target ABI is N64.
  Akira Hatanaka, 2012-10-19 (1 file changed, -2/+4)
  llvm-svn: 166344
* [mips] Add code to do tail call optimization.
  Akira Hatanaka, 2012-10-19 (2 files changed, -5/+44)
  Currently, it is enabled only if the option "enable-mips-tail-calls" is given and all of the callee's arguments are passed in registers.
  llvm-svn: 166342
* [mips] Fix TAILCALL's operand node type.
  Akira Hatanaka, 2012-10-19 (1 file changed, -5/+11)
  llvm-svn: 166341
* [mips] Delete MipsFunctionInfo::MaxCallFrameSize, which is no longer used.
  Akira Hatanaka, 2012-10-19 (2 files changed, -10/+1)
  llvm-svn: 166339
* [mips] Add tail call instructions.
  Akira Hatanaka, 2012-10-19 (2 files changed, -0/+12)
  llvm-svn: 166338
* [mips] Make the branch nodes used in jump instructions a template parameter.
  Akira Hatanaka, 2012-10-19 (1 file changed, -10/+21)
  llvm-svn: 166337
* Add node and enum for mips tail call.
  Akira Hatanaka, 2012-10-19 (3 files changed, -0/+8)
  llvm-svn: 166318
* [ms-inline asm] Have the TargetParser callback to Sema to determine the size of a memory operand.
  Chad Rosier, 2012-10-19 (1 file changed, -3/+30)
  Retain this information and then add the sizing directives to the IR. This allows the backend to do proper instruction selection.
  llvm-svn: 166316
* Fix radar://8426430: llvm support for __builtin_debugtrap(), which is supposed to consistently raise SIGTRAP across all systems.
  Shuxin Yang, 2012-10-19 (1 file changed, -0/+1)
  In contrast, __builtin_trap() behaves differently on different systems; e.g. it raises SIGTRAP on ARM and SIGILL on X86. The purpose of __builtin_debugtrap() is to consistently provide "trap" functionality while preserving compatibility with gcc's __builtin_trap(). The X86 backend is already able to handle debugtrap(). This patch:
    1) makes the front-end recognize "__builtin_debugtrap()" (embodied in the one-line change to Clang);
    2) in the DAG legalization phase, replaces "debugtrap" with "trap" by default, which makes __builtin_debugtrap() "available" to all existing ports without the hassle of changing their code;
    3) if a trap function is specified (via -trap-func=xyz to llc), expands both __builtin_debugtrap() and __builtin_trap() into a call to the specified trap function. This behavior may need to change in the future.
  The provided test case makes sure 2) and 3) work for the ARM port; we already have a test case for X86.
  llvm-svn: 166300
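  As a standalone C illustration of the builtin described in this entry (a minimal sketch, not code from the patch):

    /* Minimal sketch: contrasts the two builtins described above.
       __builtin_debugtrap() is intended to raise SIGTRAP on every target,
       while __builtin_trap() may raise SIGILL (e.g. via ud2 on X86). */
    void check(int invariant_holds) {
        if (!invariant_holds)
            __builtin_debugtrap();   /* debugger-friendly trap; by default the
                                        legalizer lowers it like "trap" on ports
                                        without dedicated handling */
    }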
* Lower BUILD_VECTOR to SHUFFLE + INSERT_VECTOR_ELT for X86.
  Michael Liao, 2012-10-19 (2 files changed, -1/+81)
  If INSERT_VECTOR_ELT is supported (above SSE2, either by a custom sequence or a legal insn), transform BUILD_VECTOR into SHUFFLE + INSERT_VECTOR_ELT when most of the elements can be built from a SHUFFLE with only a few (so far 1) elements being inserted.
  llvm-svn: 166288
* ARM: Removed the extra stack frame object for fixed byval arguments.
  Stepan Dyatkovskiy, 2012-10-19 (2 files changed, -12/+20)
  The VarArgsStyleRegisters invocation was reworked due to some improper usage in the past, which PR14099 also demonstrates.
  llvm-svn: 166273
* Reapply the TargetTransformInfo changes, minus the changes to LSR and LowerInvoke.
  Nadav Rotem, 2012-10-18 (25 files changed, -16/+183)
  llvm-svn: 166248
* Fix a bug where a 32-bit address with the high bit set does not get symbolicated.
  Kevin Enderby, 2012-10-18 (1 file changed, -2/+3)
  This happens because the value is incorrectly being sign extended when passed to SymbolLookUp().
  llvm-svn: 166234
* Fix failures in the SingleSource/Regression/C/uint64_to_float test case on PowerPC caused by rounding errors when converting from a 64-bit integer to single-precision floating point.
  Ulrich Weigand, 2012-10-18 (1 file changed, -1/+46)
  The reason for this is double-rounding effects: on PowerPC we have to convert to an intermediate double-precision value first, which then gets rounded to the final single-precision result. The patch fixes the problem by preparing the 64-bit integer so that the first conversion step to double precision is always exact, and the final rounding step yields the correctly rounded single-precision result. The generated code sequence is equivalent to what GCC would generate. When -enable-unsafe-fp-math is in effect, the extra effort is omitted and we accept possible rounding errors (just like GCC does as well).
  llvm-svn: 166178
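  A standalone numeric illustration of the double-rounding effect described above (the constant is chosen purely for illustration and does not come from the patch or the test case):

    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        /* x = 2^60 + 2^36 + 1: the +1 is lost when rounding to double, and the
           remaining value is an exact tie between two floats, so the two-step
           conversion can round differently than the direct one. */
        uint64_t x = 0x1000001000000001ULL;
        float direct  = (float)x;           /* one correctly-rounded conversion */
        float twostep = (float)(double)x;   /* via double: double rounding      */
        printf("direct=%.8e twostep=%.8e\n", direct, twostep);
        return 0;
    }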
* Temporarily revert the TargetTransform changes.
  Bob Wilson, 2012-10-18 (25 files changed, -183/+16)
  The TargetTransform changes are breaking LTO bootstraps of clang. I am working with Nadav to figure out the problem, but I am reverting it for now to get our buildbots working. This reverts svn commits 165665, 165669, 165670, 165786, 165787, and 165997, and I have also reverted clang svn 165741.
  llvm-svn: 166168
* Add conditional branch instructions and their patterns.
  Reed Kotler, 2012-10-17 (2 files changed, -2/+296)
  llvm-svn: 166134
* Merge MRI::isPhysRegOrOverlapUsed() into isPhysRegUsed().
  Jakob Stoklund Olesen, 2012-10-17 (1 file changed, -2/+2)
  All callers of these functions really want the isPhysRegOrOverlapUsed() functionality, which also checks aliases. For historical reasons, targets without register aliases were calling isPhysRegUsed() instead. Change isPhysRegUsed() to also check aliases, and switch all isPhysRegOrOverlapUsed() callers to isPhysRegUsed().
  llvm-svn: 166117
* Check for empty YMM use-def lists in X86VZeroUpper.
  Jakob Stoklund Olesen, 2012-10-17 (1 file changed, -1/+1)
  The previous MRI.isPhysRegUsed(YMM0) would also return true when the function contains a call to a function that may clobber YMM0. That's most of them. Checking the use-def chains allows us to skip functions that don't explicitly mention YMM registers.
  llvm-svn: 166110
* Fix fallout from the RegInfo => FrameLowering refactoring on MSP430.
  Anton Korobeynikov, 2012-10-17 (4 files changed, -16/+15)
  Patch by Job Noorman!
  llvm-svn: 166108
* Check SSSE3 instead of SSE4.1.
  Michael Liao, 2012-10-17 (1 file changed, -2/+2)
  All required shuffle insns, especially PSHUFB, were added in SSSE3.
  llvm-svn: 166086
* Fix setjmp when the code model is not Small or the relocation model is not Static.
  Michael Liao, 2012-10-17 (2 files changed, -22/+50)
  An MBB address is only valid as an immediate value under the Small code model with Static relocation. Under other models, an LEA is needed to load the IP address of the restore MBB. A minor fix to MBB handling in MC lowering is added as well, to enable the target relocation flag to be propagated into MC.
  llvm-svn: 166084
* Support v8f32 to v8i8/v8i16 conversion through custom lowering.
  Michael Liao, 2012-10-16 (2 files changed, -17/+39)
  Add custom FP_TO_SINT on v8i16 (and on v8i8, which is legalized as v8i16 due to vector element-wise widening) to reduce DAG combiner work and the overhead it adds in the X86 backend.
  llvm-svn: 166036
* This patch addresses PR13949.
  Bill Schmidt, 2012-10-16 (1 file changed, -6/+25)
  For the PowerPC 64-bit ELF Linux ABI, aggregates of size less than 8 bytes are to be passed in the low-order bits ("right-adjusted") of the doubleword register or memory slot assigned to them. A previous patch addressed this for aggregates passed in registers. However, small aggregates passed in the overflow portion of the parameter save area were still being passed left-adjusted.
  The fix is made in PPCTargetLowering::LowerCall_Darwin_Or_64SVR4 on the caller side, and in PPCTargetLowering::LowerFormalArguments_64SVR4 on the callee side. The main fix on the callee side simply extends existing logic for 1- and 2-byte objects to 1- through 7-byte objects, and corrects a constant left over from 32-bit code. There is also a fix to a bogus calculation of the offset to the following argument in the parameter save area.
  On the caller side, again a constant left over from 32-bit code is fixed. Additionally, some code for 1-, 2-, and 4-byte objects is duplicated to handle the 3-, 5-, 6-, and 7-byte objects for SVR4 only. The LowerCall_Darwin_Or_64SVR4 logic is getting fairly convoluted trying to handle both ABIs, and I propose to separate this into two functions in a future patch, at which time the duplication can be removed.
  The patch adds a new test (structsinmem.ll) to demonstrate correct passing of structures of all seven sizes. Eight dummy parameters are used to force these structures into the overflow portion of the parameter save area.
  As a side effect, this corrects the case when aggregates passed in registers are saved into the first eight doublewords of the parameter save area: previously they were stored left-justified, and now they are properly stored right-justified. This requires changing the expected output of the existing test case structsinregs.ll.
  llvm-svn: 166022
* Issue: the stack is formed improperly for long structures passed as byval arguments in EABI mode.
  Stepan Dyatkovskiy, 2012-10-16 (2 files changed, -11/+20)
  Consulting the AAPCS reference, we find the following statements:
    A. "If the argument requires double-word alignment (8-byte), the NCRN (Next Core Register Number) is rounded up to the next even register number." (5.5 Parameter Passing, Stage C, C.3)
    B. "The alignment of an aggregate shall be the alignment of its most-aligned component." (4.3 Composite Types, 4.3.1 Aggregates)
  So if we have a structure with doubles (9 double fields) and 3 unused core registers (r1, r2, r3), the caller should use only r2 and r3. Currently the set r1, r2, r3 is used, which is invalid. The callee's va_arg handling should likewise use only r2 and r3; that side is already correct, since the behaviour is obtained by rounding up the SP address with ADD+BFC operations.
  Fix: the main fix is in ARMTargetLowering::HandleByVal. If AAPCS mode and 8-byte alignment are detected, the odd register is skipped (left unused).
  P.S.: The LDRB_POST_IMM regression test was also improved, since the ldrb instruction would no longer be generated by the existing test after this patch.
  llvm-svn: 166018
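  A small C sketch (illustrative only, not from the patch) of the parameter layout this entry fixes:

    /* Illustrative only: an aggregate whose most-aligned member is a double,
       so under AAPCS it requires 8-byte alignment when passed byval. */
    struct Big { double d[9]; };

    /* 'a' occupies r0, leaving r1-r3 free.  Because 'b' needs double-word
       alignment, the NCRN is rounded up from r1 to r2, so only r2/r3 carry
       the first words of 'b' and r1 is left unused -- the behaviour the
       HandleByVal change above implements. */
    void callee(int a, struct Big b);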
* Reapply r165661, patch by Shuxin Yang <shuxin.llvm@gmail.com>.
  NAKAMURA Takumi, 2012-10-16 (1 file changed, -0/+41)
  Original message:
  The attached is the fix to radar://11663049. The optimization can be outlined by the following rules:
    (select (x != c), e, c) -> (select (x != c), e, x)
    (select (x == c), c, e) -> (select (x == c), x, e)
  where <c> is an integer constant.
  The reason for this change is that, on x86, a conditional move from a constant needs two instructions, whereas a conditional move from a register needs only one. While LowerSELECT() sounds like the most convenient place for this optimization, it turns out to be a bad place: by replacing the constant <c> with a symbolic value, it obscures some instruction-combining opportunities which would otherwise be very easy to spot. For that reason, the change has to be postponed to the last instruction-combining phase.
  The change passes the tests of "make check-all -C <build-root>/test" and "make -C project/test-suite/SingleSource".
  Original message since r165661:
  My previous change had a bug: I negated the condition code of a CMOV, and went ahead creating a new CMOV using the *ORIGINAL* condition code.
  llvm-svn: 166017
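  A C-level sketch (illustrative only, not from the patch) of the pattern the combine above targets:

    /* Illustrative only: when x == 42 the result is 42, i.e. x itself.  After
       the combine, the constant operand of the select is rewritten as x, so
       on x86 a single register-register CMOV suffices instead of first
       materializing the constant 42 into a register. */
    int select_const(int x, int e) {
        return (x == 42) ? 42 : e;
    }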
* Move the X86MCInstLower class definition into the implementation file. It's not needed outside.
  Craig Topper, 2012-10-16 (3 files changed, -54/+25)
  llvm-svn: 166014
* Pass in the context to the Attributes::get method.
  Bill Wendling, 2012-10-16 (1 file changed, -1/+1)
  llvm-svn: 166007
* Add __builtin_setjmp/_longjmp support in the X86 backend.
  Michael Liao, 2012-10-15 (7 files changed, -1/+260)
  Besides being used in SjLj exception handling, __builtin_setjmp/__builtin_longjmp is also used as a lightweight replacement for setjmp/longjmp, e.g. to implement continuations, user-level threading, and so on. The support added in this patch ONLY addresses this usage and is NOT intended to support SjLj exception handling, as zero-cost DWARF exception handling is used by default on X86.
  llvm-svn: 165989
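  A minimal C sketch (illustrative only) of the lightweight, non-EH usage this entry refers to:

    #include <stdio.h>

    /* __builtin_setjmp/__builtin_longjmp take a five-word buffer and the
       longjmp value must be 1; this is the light-weight usage described above,
       not SjLj exception handling. */
    static void *jump_buf[5];

    static void bail_out(void) {
        __builtin_longjmp(jump_buf, 1);
    }

    int main(void) {
        if (__builtin_setjmp(jump_buf) == 0) {
            puts("first pass");
            bail_out();
        } else {
            puts("resumed via __builtin_longjmp");
        }
        return 0;
    }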
* ARM: v1i64 and v2i64 VBSL intrinsic support.
  Jim Grosbach, 2012-10-15 (1 file changed, -0/+17)
  rdar://12502028
  llvm-svn: 165981
* Move the Attributes::Builder outside of the Attributes class and into its own class named AttrBuilder. No functionality change.
  Bill Wendling, 2012-10-15 (1 file changed, -2/+2)
  llvm-svn: 165960
* [ms-inline asm] If we parsed a statement and the opcode is valid, then it's an instruction.
  Chad Rosier, 2012-10-15 (1 file changed, -0/+3)
  llvm-svn: 165955
* [ms-inline asm] Update the end loc for ParseIntelMemOperand.
  Chad Rosier, 2012-10-15 (1 file changed, -0/+1)
  llvm-svn: 165947
* [ms-inline asm] Use the incoming argument rather than hard coding to false.
  Chad Rosier, 2012-10-15 (1 file changed, -1/+1)
  llvm-svn: 165945
* Resubmit the changes to llvm core to update the functions to support different pointer sizes on a per address space basis.
  Micah Villmow, 2012-10-15 (10 files changed, -21/+31)
  llvm-svn: 165941
* PowerPC: add EmitTCEntry class for TOC creation.
  Adhemerval Zanella, 2012-10-15 (1 file changed, -2/+2)
  This patch replaces EmitRawText with an EmitTCEntry class (specialized for each Streamer) in PowerPC64 TOC entry creation.
  llvm-svn: 165940
* Fixed PR13938: the ARM backend was crashing because it couldn't select a VDUPLANE node with the vector input size different from the output size.
  Silviu Baranga, 2012-10-15 (1 file changed, -2/+19)
  This was because the BUILD_VECTOR lowering code didn't check that the size of the input vector was correct for using VDUPLANE.
  llvm-svn: 165929
* Attributes Rewrite
  Bill Wendling, 2012-10-15 (1 file changed, -2/+3)
  Convert the internal representation of the Attributes class into a pointer to an opaque object that's uniqued by and stored in the LLVMContext object. The Attributes class then becomes a thin wrapper around this opaque object. Eventually, the internal representation will be expanded to include attributes that represent code generation options, etc.
  llvm-svn: 165917
* Don't pass in an Attributes object to something that expects an integral value.
  Bill Wendling, 2012-10-14 (1 file changed, -3/+1)
  llvm-svn: 165887
* X86: Disable long nops for all CPUs prior to pentiumpro/i686.
  Benjamin Kramer, 2012-10-13 (1 file changed, -1/+3)
  llvm-svn: 165878
* X86: Fix accidentally swapped operands.
  Benjamin Kramer, 2012-10-13 (1 file changed, -1/+1)
  llvm-svn: 165871
* X86: Promote i8 cmov when both operands are coming from truncates of the same width.
  Benjamin Kramer, 2012-10-13 (1 file changed, -0/+15)
  X86 doesn't have i8 cmovs, so isel would emit a branch. Emitting branches at this level is often not a good idea because it's too late for many optimizations to kick in. This solution doesn't add any extensions (truncs are free) and tries to avoid introducing partial register stalls by filtering direct copyfromregs.
  I'm seeing a ~10% speedup on reading a random .png file with libpng15 via graphicsmagick on x86_64/westmere, but YMMV depending on the microarchitecture.
  llvm-svn: 165868
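  A C sketch (illustrative only, not from the patch) of the pattern being promoted:

    /* Illustrative only: an i8 select whose operands are truncations of
       same-width values.  Promoting the select back to 32 bits lets isel use
       a 32-bit CMOV instead of emitting a branch, since the truncations are
       free. */
    unsigned char pick(unsigned int a, unsigned int b, int cond) {
        unsigned char ta = (unsigned char)a;   /* trunc i32 -> i8 */
        unsigned char tb = (unsigned char)b;   /* trunc i32 -> i8 */
        return cond ? ta : tb;                 /* i8 select */
    }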
* [ms-inline asm] Remove the MatchInstruction() function.
  Chad Rosier, 2012-10-13 (4 files changed, -41/+30)
  Previously, this was the interface between the front-end and the MC layer when parsing inline assembly. Unfortunately, this is too deep into the parsing stack. Specifically, we're unable to handle target-independent assembly (i.e., assembly directives, labels, etc.). Note that MatchAndEmitInstruction() isn't the correct abstraction either. I'll be exposing target-independent hooks shortly, so this is really just a cleanup.
  llvm-svn: 165858
* ARM: tail-call inside a function where part of a byval argument is on the caller's local frame causes problems.
  Manman Ren, 2012-10-12 (1 file changed, -0/+8)
  For example:
    void f(StructToPass s) {
      g(&s, sizeof(s));
    }
  will cause a problem with tail-call, since part of s is passed via registers and saved in f's local frame. When g tries to access s, part of s may be corrupted, since f's local frame is popped out before the tail-call.
  The current fix is to disable tail-call if getVarArgsRegSaveSize is not 0 for the caller. This is a conservative approach: if we can prove the address of s or part of s is not taken and passed to g, it should be okay to perform the tail-call.
  rdar://12442472
  llvm-svn: 165853
* [ms-inline asm] Capitalize per coding standard.
  Chad Rosier, 2012-10-12 (1 file changed, -19/+19)
  llvm-svn: 165847
* ARM: Mark VSELECT as 'expand'.
  Jim Grosbach, 2012-10-12 (1 file changed, -0/+1)
  The backend already pattern matches to form VBSL when it can. We may want to teach it to use the vbsl intrinsics at some point to prevent machine licm from mucking with this, but using Expand is completely correct.
  http://llvm.org/bugs/show_bug.cgi?id=13831
  http://llvm.org/bugs/show_bug.cgi?id=13961
  Patch by Peter Couperus <peter.couperus@st.com>.
  llvm-svn: 165845