path: root/llvm/lib/Target
* Remove the TargetMachine forwards for TargetSubtargetInfo based information and update all callers. No functional change.
  (Eric Christopher, 2014-08-04; 192 files changed, -1144/+1420)
  llvm-svn: 214781
* [AArch64] Extend the number of scalar instructions supported in the AdvSIMD scalar integer instruction pass.
  (Chad Rosier, 2014-08-04; 1 file changed, -0/+6)
  This is a patch I had lying around from a few months ago. The pass is currently disabled by default, so nothing too interesting.
  llvm-svn: 214779
* Fix failure to invoke exception handler on Win64
  (Reid Kleckner, 2014-08-04; 3 files changed, -0/+48)
  When the last instruction prior to a function epilogue is a call, we need to emit a nop so that the return address is not in the epilogue IP range. This is consistent with MSVC's behavior, and may be a workaround for a bug in the Win64 unwinder.
  Differential Revision: http://reviews.llvm.org/D4751
  Patch by Vadim Chugunov!
  llvm-svn: 214775
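  A rough sketch of the kind of check described above (illustration only, not the actual r214775 patch; the helper name, its insertion point, and the use of X86::NOOP are assumptions, using only generic CodeGen APIs):

      // Hypothetical helper: if the instruction right before the epilogue is a
      // call, insert a nop so the return address does not fall into the
      // epilogue's IP range.
      #include "llvm/CodeGen/MachineBasicBlock.h"
      #include "llvm/CodeGen/MachineInstrBuilder.h"
      #include "llvm/Target/TargetInstrInfo.h"
      #include <iterator>
      using namespace llvm;

      static void padCallBeforeEpilogue(MachineBasicBlock &MBB,
                                        MachineBasicBlock::iterator EpilogueI,
                                        const TargetInstrInfo &TII,
                                        unsigned NopOpcode /* e.g. X86::NOOP */) {
        if (EpilogueI == MBB.begin())
          return;                                  // nothing precedes the epilogue
        MachineBasicBlock::iterator Prev = std::prev(EpilogueI);
        if (Prev->isCall())                        // the return address would otherwise
          BuildMI(MBB, EpilogueI, DebugLoc(), TII.get(NopOpcode)); // land in the epilogue
      }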
* Recognize mftbl as alias for mftb, for symmetry with mttb.
  (Joerg Sonnenberger, 2014-08-04; 1 file changed, -0/+1)
  llvm-svn: 214769
* R600/SI: Fix definitions for ds_read2 / ds_write2 instructions.
  (Matt Arsenault, 2014-08-04; 2 files changed, -3/+4)
  These were just wrong, using the wrong register classes, and store2 was missing an operand.
  llvm-svn: 214756
* Rename PPCLinuxMCAsmInfo to PPCELFMCAsmInfo to better reflect the systems it represents.
  (Joerg Sonnenberger, 2014-08-04; 3 files changed, -5/+5)
  llvm-svn: 214755
* Allow .lcomm with alignment on ELF targets.
  (Joerg Sonnenberger, 2014-08-04; 1 file changed, -0/+1)
  llvm-svn: 214754
* Move the R600 intrinsic support back to the target machine - there's nothing subtarget dependent about the intrinsic support in any backend as far as I can tell.
  (Eric Christopher, 2014-08-04; 4 files changed, -6/+4)
  llvm-svn: 214738
* Refactor SPRG instructions.
  (Joerg Sonnenberger, 2014-08-04; 1 file changed, -35/+16)
  llvm-svn: 214733
* [X86] Place parentheses around "isMask_32(STReturns) && N <= 2".
  (Akira Hatanaka, 2014-08-04; 1 file changed, -1/+1)
  This corrects r214672, which was committed to silence a gcc warning.
  llvm-svn: 214732
* Add support for m[ft][di]bat[ul] instructions.
  (Joerg Sonnenberger, 2014-08-04; 4 files changed, -0/+33)
  llvm-svn: 214731
* Use the known address space constant rather than checking it
  (Matt Arsenault, 2014-08-04; 1 file changed, -1/+1)
  llvm-svn: 214729
* R600: Remove unused include
  (Matt Arsenault, 2014-08-04; 1 file changed, -1/+0)
  llvm-svn: 214728
* Add a dummy subtarget to the CPP backend target machine.
  (Eric Christopher, 2014-08-04; 1 file changed, -3/+9)
  This will allow us to forward all of the standard TargetMachine calls to the subtarget and still return null as we were before.
  llvm-svn: 214727
* Add features for PPC 4xx and e500/e500mc instructions.
  (Joerg Sonnenberger, 2014-08-04; 4 files changed, -4/+18)
  Move the test cases for them into separate files.
  llvm-svn: 214724
* [SKX] Enabling load/store instructions: encoding
  (Robert Khasanov, 2014-08-04; 3 files changed, -127/+206)
  Instructions: VMOVAPD, VMOVAPS, VMOVDQA8, VMOVDQA16, VMOVDQA32, VMOVDQA64, VMOVDQU8, VMOVDQU16, VMOVDQU32, VMOVDQU64, VMOVUPD, VMOVUPS
  Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com>
  llvm-svn: 214719
* [PowerPC] Swap arguments to vpkuhum/vpkuwum on little-endian
  (Ulrich Weigand, 2014-08-04; 3 files changed, -36/+68)
  In commit r213915, Bill fixed little-endian usage of vmrgh* and vmrgl* by swapping the input arguments. As it turns out, the exact same fix is also required for the vpkuhum/vpkuwum patterns.
  This fixes another regression in llvmpipe when vector support is enabled.
  Reviewed by Bill Schmidt.
  llvm-svn: 214718
* Improving the name of the function parameter, which happens to solve two likely-less-than-useful MSVC warnings: warning C4258: 'I' : definition from the for loop is ignored; the definition from the enclosing scope is used.
  (Aaron Ballman, 2014-08-04; 1 file changed, -14/+14)
  llvm-svn: 214717
* [PowerPC] MULHU/MULHS are not legal for vector types
  (Ulrich Weigand, 2014-08-04; 1 file changed, -0/+2)
  I ran into some test failures where common code changed vector division by constant into a multiply-high operation (MULHU). But these are not implemented by the back-end, so we failed to recognize the insn. Fixed by marking MULHU/MULHS as Expand for vector types.
  llvm-svn: 214716
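  The fix described here amounts to a couple of lines in the PowerPC lowering setup; a minimal sketch, assuming it sits in the existing loop over legal vector types with VT as the loop's current MVT:

      // Ask legalization to expand high-half multiplies for vector types,
      // since the backend has no instructions to select for them.
      setOperationAction(ISD::MULHU, VT, Expand);
      setOperationAction(ISD::MULHS, VT, Expand);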
* [PowerPC] Fix and improve vector comparisons
  (Ulrich Weigand, 2014-08-04; 2 files changed, -150/+111)
  This patch refactors code generation of vector comparisons. This fixes a wrong code-gen bug for ISD::SETGE for floating-point types, and improves generated code for vector comparisons in general.
  Specifically, the patch moves all logic deciding how to implement vector comparisons into getVCmpInst, which gets two extra boolean outputs indicating to its caller whether it needs to swap the input operands and/or negate the result of the comparison. Apart from implementing these two modifications as directed by getVCmpInst, there is no need to ever implement vector comparisons in any other manner; in particular, there is never a need to perform two separate comparisons (e.g. one for equal and one for greater-than, as code used to do before this patch).
  Reviewed by Bill Schmidt.
  llvm-svn: 214714
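  A self-contained illustration (not the LLVM code) of the swap/negate reduction that getVCmpInst now directs, shown for integer lanes; ordered/unordered floating-point compares, including the SETGE case fixed here, need extra care beyond this:

      #include <array>
      #include <cstdint>
      #include <cstdio>

      using V4 = std::array<int32_t, 4>;

      // The single comparison primitive: lanewise "greater than" producing a mask.
      static V4 cmpGT(const V4 &A, const V4 &B) {
        V4 R{};
        for (int i = 0; i < 4; ++i)
          R[i] = A[i] > B[i] ? -1 : 0;
        return R;
      }

      // Lanewise NOT of a mask, i.e. the "negate the result" modification.
      static V4 negate(const V4 &M) {
        V4 R{};
        for (int i = 0; i < 4; ++i)
          R[i] = ~M[i];
        return R;
      }

      int main() {
        V4 A{1, 5, 3, 7}, B{2, 5, 1, 9};
        V4 LT = cmpGT(B, A);          // SETLT: swap the operands
        V4 LE = negate(cmpGT(A, B));  // SETLE: negate the GT result
        std::printf("lt: %d %d %d %d\n", LT[0], LT[1], LT[2], LT[3]);
        std::printf("le: %d %d %d %d\n", LE[0], LE[1], LE[2], LE[3]);
        return 0;
      }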
* [mips] Add assembler support for '.set mipsX'.
  (Daniel Sanders, 2014-08-04; 3 files changed, -3/+199)
  Summary:
  This patch also fixes an issue with the way the Mips assembler enables/disables architecture features. Before this patch, the assembler never disabled feature bits. For example,
      .set mips64
      .set mips32r2
  would result in the 'OR' of mips64 with mips32r2 feature bits, which isn't right. Unfortunately this isn't trivial to fix because there's not an easy way to clear feature bits, as the algorithm in MCSubtargetInfo (ToggleFeature) only clears the bits that imply the feature being cleared and not the bits implied by the feature (there's a better explanation in the code I added).
  Patch by Matheus Almeida and updated by Toma Tabacu
  Reviewers: vmedic, matheusalmeida, dsanders
  Reviewed By: dsanders
  Subscribers: tomatabacu, llvm-commits
  Differential Revision: http://reviews.llvm.org/D4123
  llvm-svn: 214709
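  A toy, self-contained illustration of the feature-bit problem described in the summary (the mask values are invented for the example and are not MIPS's real feature bits):

      #include <cstdint>
      #include <cstdio>

      // Invented masks: each higher ISA level implies the lower ones.
      constexpr uint64_t FeatureMips32   = 1u << 0;
      constexpr uint64_t FeatureMips32r2 = (1u << 1) | FeatureMips32;
      constexpr uint64_t FeatureMips64   = (1u << 2) | FeatureMips32r2;

      int main() {
        uint64_t Bits = FeatureMips64;                    // after ".set mips64"
        // Old behaviour: ".set mips32r2" only ORs bits in, so the mips64 bit survives.
        uint64_t OrOnly = Bits | FeatureMips32r2;
        // Intended behaviour: clear the previously selected ISA, then set the new one.
        uint64_t Replaced = (Bits & ~FeatureMips64) | FeatureMips32r2;
        std::printf("or-only: %#llx, replaced: %#llx\n",
                    (unsigned long long)OrOnly, (unsigned long long)Replaced);
        return 0;
      }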
* [x86] Just unilaterally prefer SSSE3-style PSHUFB lowerings over clever use of PACKUS. It's cleaner that way.
  (Chandler Carruth, 2014-08-04; 1 file changed, -35/+35)
  I looked at implementing clever combine-based folding of PACKUS chains into PSHUFB but it is quite hard and doesn't seem likely to be worth it. The most annoying part would be detecting that the correct masking had been done to use PACKUS-style instructions as a blend operation rather than there being any saturating as is indicated by its name.
  We generate really nice code for what few test cases I've come up with that aren't completely contrived for this by just directly preferring PSHUFB and so let's go with that strategy for now. =]
  llvm-svn: 214707
* [x86] Implement more aggressive use of PACKUS chains for lowering common patterns of v16i8 shuffles.
  (Chandler Carruth, 2014-08-04; 1 file changed, -0/+106)
  This implements one of the more important FIXMEs for the SSE2 support in the new shuffle lowering. We now generate the optimal shuffle sequence for truncate-derived shuffles which show up essentially everywhere.
  Unfortunately, this exposes a weakness in other parts of the shuffle logic -- we can no longer form PSHUFB here. I'll add the necessary support for that and other things in a subsequent commit.
  llvm-svn: 214702
* Revert "r214669 - MachineCombiner Pass for selecting faster instruction"Kevin Qin2014-08-045-536/+13
| | | | | | This commit broke "make check" for several hours, so get it reverted. llvm-svn: 214697
* [x86] Handle single input shuffles in the SSSE3 case more intelligently.
  (Chandler Carruth, 2014-08-04; 1 file changed, -0/+4)
  I spent some time looking into a better or more principled way to handle this. For example, by detecting arbitrary "unneeded" ORs... But really, there wasn't any point. We just shouldn't build blatantly wrong code so late in the pipeline rather than adding more stages and logic later on to fix it. Avoiding this is just too simple.
  llvm-svn: 214680
* X86: silence warning (-Wparentheses)
  (Saleem Abdulrasool, 2014-08-03; 1 file changed, -1/+1)
  GCC 4.8.2 points out the ambiguity in evaluation of the assertion condition:
      lib/Target/X86/X86FloatingPoint.cpp:949:49: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]
          assert(STReturns == 0 || isMask_32(STReturns) && N <= 2);
  llvm-svn: 214672
* MachineCombiner Pass for selecting faster instruction sequence - AArch64 target support
  (Gerolf Hoflehner, 2014-08-03; 5 files changed, -13/+536)
  This patch turns off madd/msub generation in the DAGCombiner and generates them in the MachineCombiner instead. It replaces the original code sequence with the combined sequence when it is beneficial to do so.
  When there is no machine model support it always generates the madd/msub instruction. This is true also when the objective is to optimize for code size: when the combined sequence is shorter it is always chosen and does not get evaluated.
  When there is a machine model the combined instruction sequence is evaluated for critical path and resource length using machine trace metrics, and the original code sequence is replaced when it is determined to be faster.
  rdar://16319955
  llvm-svn: 214669
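  A toy sketch of the decision rule described above (the cost model is simplified to critical path only and the numbers are invented; the real pass also weighs resource length via machine trace metrics):

      #include <cstdio>

      struct SeqCost {
        unsigned Depth;    // critical-path depth feeding the sequence's root
        unsigned Latency;  // latency of the sequence itself
      };

      // Prefer the combined madd/msub when there is no machine model, or when
      // it does not lengthen the critical path of the original mul+add sequence.
      static bool preferCombined(bool HasMachineModel, SeqCost Orig, SeqCost New) {
        if (!HasMachineModel)
          return true;
        return New.Depth + New.Latency <= Orig.Depth + Orig.Latency;
      }

      int main() {
        SeqCost MulAdd{/*Depth=*/2, /*Latency=*/4};  // mul feeding an add
        SeqCost Madd{/*Depth=*/2, /*Latency=*/4};    // single fused multiply-add
        std::printf("combine: %s\n",
                    preferCombined(true, MulAdd, Madd) ? "yes" : "no");
        return 0;
      }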
* MC: virtualise EmitWindowsUnwindTables
  (Saleem Abdulrasool, 2014-08-03; 1 file changed, -0/+7)
  This makes EmitWindowsUnwindTables a virtual function and lowers the implementation of the function to the X86WinCOFFStreamer. This method is a target specific operation. This enables making the behaviour target dependent by isolating it entirely to the target specific streamer.
  llvm-svn: 214664
* R600/SI: Fix extra whitespace in asm str
  (Matt Arsenault, 2014-08-03; 1 file changed, -1/+1)
  This slipped in in r214467, so something like V_MOV_B32_e32 v0, ... is now printed with 2 spaces between the instruction name and first operand.
  llvm-svn: 214660
* tlbia support
  (Joerg Sonnenberger, 2014-08-02; 2 files changed, -0/+4)
  llvm-svn: 214640
* mfdcr / mtdcr support
  (Joerg Sonnenberger, 2014-08-02; 1 file changed, -0/+5)
  llvm-svn: 214639
* Don't use additional arguments for dss and friends to satisfy DSS_Form, when let can do the same thing.
  (Joerg Sonnenberger, 2014-08-02; 2 files changed, -63/+54)
  Keep the 64bit variants as codegen-only. While they have a different register class, the encoding is the same for 32bit and 64bit mode. Having both present would otherwise confuse the disassembler.
  llvm-svn: 214636
* [x86] Remove the FIXME that was implemented in r214628. Managed to forget to update the comment here... =/
  (Chandler Carruth, 2014-08-02; 1 file changed, -4/+0)
  llvm-svn: 214630
* [x86] Largely complete the use of PSHUFB in the new vector shuffle lowering with a small addition to it and adding PSHUFB combining.
  (Chandler Carruth, 2014-08-02; 3 files changed, -14/+138)
  There is one obvious place in the new vector shuffle lowering where we should form PSHUFBs directly: when without them we will unpack a vector of i8s across two different registers and do a potentially 4-way blend as i16s only to re-pack them into i8s afterward. This is the crazy expensive fallback path for i8 shuffles and we can just directly use pshufb here as it will always be cheaper (the unpack and pack are two instructions so even a single shuffle between them hits our three instruction limit for forming PSHUFB).
  However, this doesn't generate very good code in many cases, and it leaves a bunch of common patterns not using PSHUFB. So this patch also adds support for extracting a shuffle mask from PSHUFB in the X86 lowering code, and uses it to handle PSHUFBs in the recursive shuffle combining. This allows us to combine through them, combine multiple ones together, and generally produce sufficiently high quality code.
  Extracting the PSHUFB mask is annoyingly complex because it could be either pre-legalization or post-legalization. At least this doesn't have to deal with re-materialized constants. =] I've added decode routines to handle the different patterns that show up at this level and we dispatch through them as appropriate.
  The two primary test cases are updated. For the v16 test case there is still a lot of room for improvement. Since I was going through it systematically I left behind a bunch of FIXME lines that I'm hoping to turn into ALL lines by the end of this.
  llvm-svn: 214628
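  For reference, the byte-level semantics of 128-bit PSHUFB that the mask extraction has to model, as a self-contained illustration (not the LLVM decode routine):

      #include <array>
      #include <cstdint>
      #include <cstdio>

      using V16 = std::array<uint8_t, 16>;

      // Each mask byte either zeroes its lane (high bit set) or selects
      // Src[Mask & 15]; this is exactly the shuffle mask a combiner can recover.
      static V16 pshufb(const V16 &Src, const V16 &Mask) {
        V16 R{};
        for (int i = 0; i < 16; ++i)
          R[i] = (Mask[i] & 0x80) ? 0 : Src[Mask[i] & 0x0F];
        return R;
      }

      int main() {
        V16 Src{}, Mask{};
        for (int i = 0; i < 16; ++i) {
          Src[i] = uint8_t(i + 1);
          Mask[i] = uint8_t(15 - i);   // reverse the byte order
        }
        V16 R = pshufb(Src, Mask);
        for (uint8_t B : R)
          std::printf("%u ", unsigned(B));
        std::printf("\n");
        return 0;
      }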
* [x86] Switch to using the variable we extracted this operand into.
  (Chandler Carruth, 2014-08-02; 1 file changed, -1/+1)
  Spotted this missed refactoring by inspection when reading code, and it doesn't change the functionality at all.
  llvm-svn: 214627
* [x86] Fix a few typos in my comments spotted in passing.
  (Chandler Carruth, 2014-08-02; 1 file changed, -3/+3)
  llvm-svn: 214626
* [x86] Teach the target shuffle mask extraction to recognize unary forms of normally binary shuffle instructions like PUNPCKL and MOVLHPS.
  (Chandler Carruth, 2014-08-02; 1 file changed, -6/+26)
  This detects cases where a single register is used for both operands making the shuffle behave in a unary way. We detect this and adjust the mask to use the unary form which allows the existing DAG combine for shuffle instructions to actually work at all.
  As a consequence, this uncovered a number of obvious bugs in the existing DAG combine which are fixed. It also now canonicalizes several shuffles even with the existing lowering. These typically are trying to match the shuffle to the domain of the input where before we only really modeled them with the floating point variants. All of the cases which change to an integer shuffle here have something in the integer domain, so there are no more or fewer domain crosses here AFAICT. Technically, it might be better to go from a GPR directly to the floating point domain, but detecting floating point *outputs* despite integer inputs is a lot more code and seems unlikely to be worthwhile in practice. If folks are seeing domain-crossing regressions here though, let me know and I can hack something up to fix it.
  Also as a consequence, a bunch of missed opportunities to form pshufb now can be formed. Notably, splats of i8s now form pshufb. Interestingly, this improves the existing splat lowering too. We go from 3 instructions to 1. Yes, we may tie up a register, but it seems very likely to be worth it, especially if splatting the 0th byte (the common case) as then we can use a zeroed register as the mask.
  llvm-svn: 214625
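  A small, self-contained illustration of the "unary form" adjustment described above (the helper is hypothetical, not the LLVM code): a two-input shuffle mask indexes the first operand with [0, N) and the second with [N, 2N), so when both operands are the same register the high indices can be folded back down, letting the existing shuffle combines match:

      #include <cstdio>
      #include <vector>

      // Fold indices that refer to the second operand back onto the first,
      // assuming both operands are the same register.
      static void canonicalizeUnaryMask(std::vector<int> &Mask, int NumElts) {
        for (int &M : Mask)
          if (M >= NumElts)
            M -= NumElts;
      }

      int main() {
        std::vector<int> Mask = {0, 4, 1, 5};  // e.g. an unpack-low with V1 == V2
        canonicalizeUnaryMask(Mask, 4);
        for (int M : Mask)
          std::printf("%d ", M);               // prints: 0 0 1 1
        std::printf("\n");
        return 0;
      }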
* [x86] Teach my pshufb comment printer to handle VPSHUFB forms as well as PSHUFB forms.
  (Chandler Carruth, 2014-08-02; 1 file changed, -2/+3)
  This will be important to update some AVX tests when I add PSHUFB combining.
  llvm-svn: 214624
* [ARM] In dynamic-no-pic mode, ARM's post-RA pseudo expansion was incorrectly expanding pseudo LOAD_STACK_GUARD using instructions that are normally used in pic mode. This patch fixes the bug.
  (Akira Hatanaka, 2014-08-02; 3 files changed, -9/+9)
  <rdar://problem/17886592>
  llvm-svn: 214614
* R600/SI: Fix formatting.
  (Matt Arsenault, 2014-08-02; 1 file changed, -22/+28)
  Avoid weird line wrapping of BuildMI dest register.
  llvm-svn: 214608
* [SDAG] Allow the legalizer to delete an illegally typed intermediate introduced during legalization.
  (Chandler Carruth, 2014-08-02; 1 file changed, -3/+4)
  This pattern is based on other patterns in the legalizer that I changed in the same way. Now, the legalizer eagerly collects its garbage when necessary so that we can survive leaving such nodes around for it. Instead, we add an assert to make sure the node will be correctly handled by that layer.
  llvm-svn: 214602
* [SDAG] Let the DAG combiner take care of dead nodes rather than manually deleting them.
  (Chandler Carruth, 2014-08-02; 1 file changed, -2/+0)
  This already seems to work, as no tests fail without this.
  llvm-svn: 214601
* [X86] Simplify X87 stackifier pass.
  (Akira Hatanaka, 2014-08-01; 7 files changed, -308/+196)
  Stop using ST registers for function returns and inline-asm instructions and use FP registers instead. This allows removing a large amount of code in the stackifier pass that was needed to track register liveness and handle copies between ST and FP registers and function calls returning floating point values.
  It also fixes a bug which manifests when an ST register defined by an inline-asm instruction was live across another inline-asm instruction, as shown in the following sequence of machine instructions:
      1. INLINEASM <es:frndint> $0:[regdef], %ST0<imp-def,tied5>
      2. INLINEASM <es:fldcw $0>
      3. %FP0<def> = COPY %ST0
  <rdar://problem/16952634>
  llvm-svn: 214580
* [SDAG] MorphNodeTo recursively deletes dead operands of the old formulation of the node, which isn't really the desired behavior from within the combiner or legalizer, but is necessary within ISel.
  (Chandler Carruth, 2014-08-01; 2 files changed, -9/+9)
  I've added a hopefully helpful comment and fixed the only two places where this took place.
  Yet another step toward the combiner and legalizer not needing to use update listeners with virtual calls to manage the worklists behind legalization and combining.
  llvm-svn: 214574
* Revert "R600: Move code for generating REGISTER_LOAD into R600ISelLowering.cpp"Tom Stellard2014-08-012-42/+37
| | | | | | | | This reverts commit r214566. I did not mean to commit this yet. llvm-svn: 214572
* R600/SI: Remove leftover debugging code
  (Tom Stellard, 2014-08-01; 1 file changed, -2/+0)
  llvm-svn: 214569
* R600: Move code for generating REGISTER_LOAD into R600ISelLowering.cpp
  (Tom Stellard, 2014-08-01; 2 files changed, -37/+42)
  SI doesn't use REGISTER_LOAD anymore, but it was still hitting this code path for 8-bit and 16-bit private loads.
  llvm-svn: 214566
* Add a non-const subtarget returning function to the target machine so that we can use it to get the old-style JIT out of the subtarget.
  (Eric Christopher, 2014-08-01; 5 files changed, -5/+16)
  This code should be removed when the old-style JIT is removed (imminently).
  llvm-svn: 214560
* MS inline asm: Use memory constraints for functions instead of registers
  (Reid Kleckner, 2014-08-01; 1 file changed, -9/+21)
  This is consistent with how we parse them in a standalone .s file, and inline assembly shouldn't differ. This fixes errors about requiring more registers than available in cases like this:
      void f();
      void __declspec(naked) g() {
        __asm pusha
        __asm call f
        __asm popa
        __asm ret
      }
  There are no registers available to pass the address of 'f' into the asm blob. The asm should now directly call 'f'.
  Tests will land in Clang shortly.
  llvm-svn: 214550
* [FastISel][AArch64] Fold offset into the memory operation.
  (Juergen Ributzka, 2014-08-01; 1 file changed, -0/+7)
  Fold simple offsets into the memory operation:
      add x0, x0, #8
      ldr x0, [x0]
  -->
      ldr x0, [x0, #8]
  Fixes <rdar://problem/17887945>.
  llvm-svn: 214545