path: root/llvm/lib
Commit message Author Age Files Lines
* Add lswi / stswi for assembler use with a warning to not add patternsJoerg Sonnenberger2014-08-051-0/+10
| | | | | | for them. llvm-svn: 214862
* AArch64: Add support for instruction prefetch intrinsicYi Kong2014-08-051-2/+2
| | | | | | | | | Instruction prefetch is not implemented for AArch64; it is incorrectly translated into a data prefetch instruction. Differential Revision: http://reviews.llvm.org/D4777 llvm-svn: 214860
* Teach the SLP Vectorizer that keeping some values live over a callsite can ↵James Molloy2014-08-053-0/+93
| | | | | | | | have a cost. Some types, such as 128-bit vector types on AArch64, don't have any callee-saved registers. So if a value needs to stay live over a callsite, it must be spilled and refilled. This cost is now taken into account. llvm-svn: 214859
* [x86] Reformat some code I moved around in a prior commit but leftChandler Carruth2014-08-051-3/+3
| | | | | | poorly formatted. Sorry about that. llvm-svn: 214853
* Allow binary and for tblgen math.Joerg Sonnenberger2014-08-054-2/+9
| | | | llvm-svn: 214851
* [x86] Fix a crash and wrong-code bug in the new vector lowering allChandler Carruth2014-08-051-18/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | found by a single test reduced out of a failure on llvm-stress. The start of the problem (and the crash) came when we tried to use a find of a non-used slot in the move-to half of the move-mask as the target for two bad-half inputs. While if lucky this will be the first of a pair of slots which we can place the bad-half inputs into, it isn't actually guaranteed. This really isn't surprising, not sure what I was thinking. The correct way to find the two unused slots is to look for one of the *used* slots. We know it isn't that pair, and we can use some modular arithmetic to find the other pair by masking off the odd bit and adding 2 modulo 4. With this, we reliably found a viable pair of slots for the bad-half inputs. Sadly, that wasn't enough. We also had a wrong-code bug that surfaced when I reduced the test case for this where we would use the same slot twice for the two bad inputs. This is because both of the bad inputs could be in odd slots originally and thus the mod-2 mapping would actually be the same. The whole point of the weird indexing into the pair of empty slots was to try to leverage when the end result needed the two bad-half inputs to be paired in a dword and pre-pair them in the correct orientation. This is less important with the powerful combining we're now doing, and also easier and more reliable to achieve by noting that we add the bad-half inputs in order. Thus, if they are in a dword pair, the low part of that will be the first input in the sequence. Always putting that in the low element will just do the right thing in addition to computing the correct result. Test case added. =] llvm-svn: 214849
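The slot arithmetic described in this message can be sketched in a few lines. This is only an illustration, not the code from the commit: the helper name `otherPairStart` is hypothetical, and it assumes the four slots of a half are grouped as the pairs {0, 1} and {2, 3}.

```cpp
#include <cassert>

// Hypothetical helper (not from the LLVM sources). The four slots of one
// half are assumed to form two pairs: {0, 1} and {2, 3}. Given the index of
// any *used* slot, return the first slot of the *other* pair.
inline int otherPairStart(int UsedSlot) {
  assert(UsedSlot >= 0 && UsedSlot < 4 && "slot index out of range");
  // Mask off the odd bit to get the start of the used pair, then step to
  // the other pair by adding 2 modulo 4.
  return ((UsedSlot & ~1) + 2) % 4;
}
```

For a used slot of 0 or 1 the helper yields 2, and for 2 or 3 it yields 0, so the two bad-half inputs can go into that slot and the one after it.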
* [FastIsel][AArch64] Fix previous commit r214844 (Don't perform ↵Juergen Ributzka2014-08-051-6/+4
| | | | | | | | | | | | | | sign-/zero-extension for function arguments that have already been sign-/zero-extended.) The original code would fail for unsupported value types like i1, i8, and i16. This fix changes the code to only create a sub-register copy for i64 value types and all other types (i1/i8/i16/i32) just use the source register without any modifications. getRegClassFor() is now guarded by the i64 value type check, which guarantees that we always request a register for a valid value type. llvm-svn: 214848
* [FastISel][AArch64] Implement the FastLowerArguments hook.Juergen Ributzka2014-08-051-0/+103
| | | | | | | | | | | | | This implements basic argument lowering for AArch64 in FastISel. It only handles a small subset of the C calling convention. It supports simple arguments that can be passed in GPR and FPR registers. This should cover most of the trivial cases without falling back to SelectionDAG. This fixes <rdar://problem/17890986>. llvm-svn: 214846
* Revert "r214832 - MachineCombiner Pass for selecting faster instruction"Kevin Qin2014-08-055-536/+13
| | | | | | | It broke compilation of most benchmarks and internal tests, as clang crashed with a segmentation fault or assertion failure. llvm-svn: 214845
* [FastISel][AArch64] Don't perform sign-/zero-extension for function ↵Juergen Ributzka2014-08-051-2/+24
| | | | | | arguments that have already been sign-/zero-extended. llvm-svn: 214844
* Provide convenient access to the zext/sext attributes of function arguments. ↵Juergen Ributzka2014-08-051-0/+14
| | | | | | NFC. llvm-svn: 214843
* Have MachineFunction cache a pointer to the subtarget to make lookupsEric Christopher2014-08-05134-739/+482
| | | | | | | | | | | shorter/easier and have the DAG use that to do the same lookup. This can be used in the future for TargetMachine based caching lookups from the MachineFunction easily. Update the MIPS subtarget switching machinery to update this pointer at the same time it runs. llvm-svn: 214838
* MachineCombiner Pass for selecting faster instructionGerolf Hoflehner2014-08-055-13/+536
| | | | | | | | | sequence on AArch64 Re-commit of r214669 without changes to test cases LLVM::CodeGen/AArch64/arm64-neon-mul-div.ll and LLVM::CodeGen/AArch64/dp-3source.ll This resolves the reported compfails of the original commit. llvm-svn: 214832
* Add TCR register accessJoerg Sonnenberger2014-08-041-0/+3
| | | | llvm-svn: 214826
* Add PPC 603's tlbld and tlbli instructions.Joerg Sonnenberger2014-08-041-0/+5
| | | | llvm-svn: 214825
* Allow CP10/CP11 operations on ARMv5/v6Renato Golin2014-08-041-3/+7
| | | | | | | | | Those registers are VFP/NEON and vector instructions should be used instead, but old cores rely on those co-processors to enable VFP unwinding. This change was prompted by the libc++abi's unwinding routine and is also present in much legacy low-level bare-metal code that we ought to compile/assemble. Fixing bug PR20025 and allowing PR20529 to proceed with a fix in libc++abi. llvm-svn: 214802
* [PPC64LE] Fix wrong IR for vec_sld and vec_vsldoiBill Schmidt2014-08-041-30/+12
| | | | | | | | | | | | | | | My original LE implementation of the vsldoi instruction, with its altivec.h interfaces vec_sld and vec_vsldoi, produces incorrect shufflevector operations in the LLVM IR. Correct code is generated because the back end handles the incorrect shufflevector in a consistent manner. This patch and a companion patch for Clang correct this problem by removing the fixup from altivec.h and the corresponding fixup from the PowerPC back end. Several test cases are also modified to reflect the now-correct LLVM IR. llvm-svn: 214800
* Enable Darwin vararg parameters support in assembler macros.Kevin Enderby2014-08-041-1/+1
| | | | | | | Duplicate the vararg tests for Linux and add a test which mixes vararg arguments with Darwin positional parameters. Patch by: Janne Grunau <j@jannau.net> llvm-svn: 214799
* Changed the liveness tracking in the RegisterScavengerPedro Artigas2014-08-046-74/+86
| | | | | | | | to use register units instead of registers. reviewed by Jakob Stoklund Olesen. llvm-svn: 214798
* Add simplified aliases for access to DCCR, ICCR, DEAR and ESRJoerg Sonnenberger2014-08-041-0/+12
| | | | llvm-svn: 214797
* [FastISel][AArch64] Fix shift lowering for i8 and i16 value types.Juergen Ributzka2014-08-041-15/+13
| | | | | | | | | | | | | This fix changes the parameters #r and #s that are passed to the UBFM/SBFM instruction to get the zero/sign-extension for free. The original problem was that the shift left would use the 32-bit shift even for i8/i16 value types, which could leave the upper bits set with "garbage" values. The arithmetic shift right, on the other hand, would use the wrong MSB as sign-bit to determine what bits to shift into the value. This fixes <rdar://problem/17907720>. llvm-svn: 214788
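The two failure modes described here can be reproduced with plain integer arithmetic. The sketch below is an illustration in ordinary C++, not the FastISel code and not a model of the UBFM/SBFM encoding. All four function names are hypothetical, and it assumes an i8 value is held in the low 8 bits of a 32-bit register.

```cpp
#include <cstdint>

// Naive left shift: shifts the whole 32-bit register, so bits that belong
// above the i8 value end up set ("garbage" in the upper bits).
inline uint32_t shlI8Naive(uint32_t Reg, unsigned S) { return Reg << S; }

// Left shift with the zero-extension folded in: only the low 8 bits of the
// result survive, the rest are zero.
inline uint32_t shlI8ZExt(uint32_t Reg, unsigned S) {
  return (Reg << S) & 0xFFu;
}

// Naive arithmetic right shift: uses bit 31 as the sign bit, which is the
// wrong MSB for an i8 value.
inline int32_t ashrI8Naive(uint32_t Reg, unsigned S) {
  return static_cast<int32_t>(Reg) >> S;
}

// Arithmetic right shift that takes bit 7 as the sign bit by sign-extending
// the low byte first. (Right-shifting a negative value is arithmetic on
// mainstream compilers and guaranteed to be so from C++20 on.)
inline int32_t ashrI8SExt(uint32_t Reg, unsigned S) {
  int32_t V = static_cast<int8_t>(static_cast<uint8_t>(Reg));
  return V >> S;
}
```

With the register holding 0xF0, the naive shift left by 4 gives 0xF00 while the zero-extended form gives 0. With the register holding 0x80 (that is, -128 as i8), the naive arithmetic shift right by 1 gives 0x40 while the sign-aware form gives -64.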
* [SDAG] Fix a really, really terrible bug in the DAG combiner.Chandler Carruth2014-08-041-2/+2
| | | | | | | | | | | | | | | This code is completely wrong. It is also dead, as if it were to *ever* run, it would crash. Fortunately, after my work to the combiner, it is at least *possible* to reach the code, and llvm-stress has found a test case. Thanks to Patrick for reporting. It would be really good if anyone who remembers how this code works and what it was intended to do could add some more obvious test coverage instead of my completely contrived and reduced test case. My test case was so brittle I left a bread crumb comment in it to help the next person to stumble on it and not know what it was actually testing for. llvm-svn: 214785
* tlbre / tlbwe / tlbsx / tlbsx. variants for the PPC 4xx CPUs.Joerg Sonnenberger2014-08-042-0/+39
| | | | llvm-svn: 214784
* Remove the TargetMachine forwards for TargetSubtargetInfo basedEric Christopher2014-08-04289-1666/+2147
| | | | | | information and update all callers. No functional change. llvm-svn: 214781
* [AArch64] Extend the number of scalar instructions supported in the AdvSIMDChad Rosier2014-08-041-0/+6
| | | | | | | scalar integer instruction pass. This is a patch I had lying around from a few months ago. The pass is currently disabled by default, so nothing too interesting. llvm-svn: 214779
* Fix failure to invoke exception handler on Win64Reid Kleckner2014-08-043-0/+48
| | | | | | | | | | | | | When the last instruction prior to a function epilogue is a call, we need to emit a nop so that the return address is not in the epilogue IP range. This is consistent with MSVC's behavior, and may be a workaround for a bug in the Win64 unwinder. Differential Revision: http://reviews.llvm.org/D4751 Patch by Vadim Chugunov! llvm-svn: 214775
* Recognize mftbl as alias for mftb, for symmetry with mttb.Joerg Sonnenberger2014-08-041-0/+1
| | | | llvm-svn: 214769
* Reapply "DebugInfo: Ensure that all debug location scope chains from ↵David Blaikie2014-08-043-4/+34
| | | | | | | | | | | | | | instructions within a function, lead to the function itself." Originally reverted in r213432 with flakey failures on an ASan self-host build. After reduction it seems to be the same issue fixed in r213805 (ArgPromo + DebugInfo: Handle updating debug info over multiple applications of argument promotion) and r213952 (by having LiveDebugVariables strip dbg_value intrinsics in functions that are not described by debug info). Though I cannot explain why this failure was flakey... llvm-svn: 214761
* R600/SI: Fix definitions for ds_read2 / ds_write2 instructions.Matt Arsenault2014-08-042-3/+4
| | | | | | | These were just wrong, using the wrong register classes and store2 was missing an operand. llvm-svn: 214756
* Rename PPCLinuxMCAsmInfo to PPCELFMCAsmInfo to better reflect theJoerg Sonnenberger2014-08-043-5/+5
| | | | | | systems it represents. llvm-svn: 214755
* Allow .lcomm with alignment on ELF targets.Joerg Sonnenberger2014-08-041-0/+1
| | | | llvm-svn: 214754
* Coverage: add HasCodeBefore flag to a mapping region.Alex Lorenz2014-08-042-7/+15
| | | | | | | | | This flag will be used by the coverage tool to help compute the execution counts for each line in a source file. Differential Revision: http://reviews.llvm.org/D4746 llvm-svn: 214740
* Move the R600 intrinsic support back to the target machine - there'sEric Christopher2014-08-044-6/+4
| | | | | | | nothing subtarget dependent about the intrinsic support in any backend as far as I can tell. llvm-svn: 214738
* Path: Stop claiming path::const_iterator is bidirectionalJustin Bogner2014-08-041-12/+27
| | | | | | | | | | | | | | | | | | | | | | path::const_iterator claims that it's a bidirectional iterator, but it doesn't satisfy all of the contracts for a bidirectional iterator. For example, n3376 24.2.5 p6 says "If a and b are both dereferenceable, then a == b if and only if *a and *b are bound to the same object", but this doesn't work with how we stash and recreate Components. This means that our use of reverse_iterator on this type is invalid and leads to many of the valgrind errors we're hitting, as explained by Tilmann Scheller here: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140728/228654.html Instead, we admit that path::const_iterator is only an input_iterator, and implement a second input_iterator for path::reverse_iterator (by changing const_iterator::operator-- to reverse_iterator::operator++). All of the uses of this just traverse once over the path in one direction or the other anyway. llvm-svn: 214737
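The contract violation quoted from n3376 can be shown with a toy iterator. This is a minimal sketch, not LLVM's `path::const_iterator`: the class `StashingIter` and the function `equalItersShareObject` are hypothetical, and the only assumption carried over from the message is that the iterator stores ("stashes") its current component inside itself.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for an iterator that keeps a copy of the current
// path component inside the iterator object.
class StashingIter {
  std::string Component; // stashed copy of the current component
  std::size_t Position;  // offset of the component within the path

public:
  StashingIter(std::string C, std::size_t P)
      : Component(std::move(C)), Position(P) {}

  // Dereferencing returns a reference into *this* iterator object.
  const std::string &operator*() const { return Component; }

  bool operator==(const StashingIter &RHS) const {
    return Position == RHS.Position;
  }
};

// Returns true only if two equal iterators dereference to the same object,
// which is what the quoted requirement demands.
inline bool equalItersShareObject() {
  StashingIter A("usr", 0), B("usr", 0);
  return A == B && &*A == &*B;
}
```

Here `equalItersShareObject()` returns false: the iterators compare equal, but each one refers to its own stash. The same design is what makes `std::reverse_iterator` unsafe on such a type, since its `operator*` decrements a temporary copy of the underlying iterator and returns whatever that temporary's dereference yields, which here would be a reference into the temporary.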
* Refactor SPRG instructions.Joerg Sonnenberger2014-08-041-35/+16
| | | | llvm-svn: 214733
* [X86] Place parentheses around "isMask_32(STReturns) && N <= 2".Akira Hatanaka2014-08-041-1/+1
| | | | | | This corrects r214672, which was committed to silence a gcc warning. llvm-svn: 214732
* Add support for m[ft][di]bat[ul] instructions.Joerg Sonnenberger2014-08-044-0/+33
| | | | llvm-svn: 214731
* Use the known address space constant rather than checking itMatt Arsenault2014-08-041-1/+1
| | | | llvm-svn: 214729
* R600: Remove unused includeMatt Arsenault2014-08-041-1/+0
| | | | llvm-svn: 214728
* Add a dummy subtarget to the CPP backend target machine. This willEric Christopher2014-08-041-3/+9
| | | | | | | allow us to forward all of the standard TargetMachine calls to the subtarget and still return null as we were before. llvm-svn: 214727
* Add features for PPC 4xx and e500/e500mc instructions.Joerg Sonnenberger2014-08-044-4/+18
| | | | | | Move the test cases for them into separate files. llvm-svn: 214724
* [SKX] Enabling load/store instructions: encodingRobert Khasanov2014-08-043-127/+206
| | | | | | Instructions: VMOVAPD, VMOVAPS, VMOVDQA8, VMOVDQA16, VMOVDQA32, VMOVDQA64, VMOVDQU8, VMOVDQU16, VMOVDQU32, VMOVDQU64, VMOVUPD, VMOVUPS. Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com> llvm-svn: 214719
* [PowerPC] Swap arguments to vpkuhum/vpkuwum on little-endianUlrich Weigand2014-08-043-36/+68
| | | | | | | | | | | | | In commit r213915, Bill fixed little-endian usage of vmrgh* and vmrgl* by swapping the input arguments. As it turns out, the exact same fix is also required for the vpkuhum/vpkuwum patterns. This fixes another regression in llvmpipe when vector support is enabled. Reviewed by Bill Schmidt. llvm-svn: 214718
* Improving the name of the function parameter, which happens to solve two ↵Aaron Ballman2014-08-041-14/+14
| | | | | | likely-less-than-useful MSVC warnings: warning C4258: 'I' : definition from the for loop is ignored; the definition from the enclosing scope is used. llvm-svn: 214717
* [PowerPC] MULHU/MULHS are not legal for vector typesUlrich Weigand2014-08-041-0/+2
| | | | | | | | | | I ran into some test failures where common code changed vector division by constant into a multiply-high operation (MULHU). But these are not implemented by the back-end, so we failed to recognize the insn. Fixed by marking MULHU/MULHS as Expand for vector types. llvm-svn: 214716
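For readers unfamiliar with the transformation mentioned here, the scalar version of "division by constant becomes multiply-high" looks roughly as follows. This is a generic sketch of the well-known technique for unsigned 32-bit division by 3, not code from the PowerPC back end. The names `mulhu32` and `udiv3` are made up for the example.

```cpp
#include <cstdint>

// Unsigned multiply-high: the upper 32 bits of the 64-bit product.
inline uint32_t mulhu32(uint32_t A, uint32_t B) {
  return static_cast<uint32_t>((static_cast<uint64_t>(A) * B) >> 32);
}

// x / 3 for any 32-bit unsigned x, using the magic constant
// 0xAAAAAAAB == (2^33 + 1) / 3 and a final shift right by one.
inline uint32_t udiv3(uint32_t X) {
  return mulhu32(X, 0xAAAAAAABu) >> 1;
}
```

A target that cannot select the multiply-high node for a given type has to have it expanded into other operations, which is what marking it as Expand requests.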
* [PowerPC] Fix and improve vector comparisonsUlrich Weigand2014-08-042-150/+111
| | | | | | | | | | | | | | | | | | This patch refactors code generation of vector comparisons. This fixes a wrong code-gen bug for ISD::SETGE for floating-point types, and improves generated code for vector comparisons in general. Specifically, the patch moves all logic deciding how to implement vector comparisons into getVCmpInst, which gets two extra boolean outputs indicating to its caller whether it needs to swap the input operands and/or negate the result of the comparison. Apart from implementing these two modifications as directed by getVCmpInst, there is no need to ever implement vector comparisons in any other manner; in particular, there is never a need to perform two separate comparisons (e.g. one for equal and one for greater-than, as code used to do before this patch). Reviewed by Bill Schmidt. llvm-svn: 214714
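The swap/negate idea can be sketched on scalar integers. This is not getVCmpInst itself: the types `Cond`, `BaseCmp`, `CmpPlan` and the functions `planIntCompare` and `runPlan` are hypothetical, and the sketch assumes only two primitive comparisons (equal and greater-than) are available. It covers integers only; for floating point, negating a comparison is not equivalent when an operand is NaN, so a real implementation needs a different mapping there.

```cpp
// Hypothetical condition codes and primitives for the illustration.
enum class Cond { EQ, NE, LT, LE, GT, GE };
enum class BaseCmp { EQ, GT };

// One primitive comparison plus two flags, mirroring the two extra boolean
// outputs described in the commit message.
struct CmpPlan {
  BaseCmp Base;
  bool Swap;   // swap the input operands before comparing
  bool Negate; // negate the result of the comparison
};

inline CmpPlan planIntCompare(Cond CC) {
  switch (CC) {
  case Cond::EQ: return {BaseCmp::EQ, false, false};
  case Cond::NE: return {BaseCmp::EQ, false, true};  // !(a == b)
  case Cond::GT: return {BaseCmp::GT, false, false};
  case Cond::LT: return {BaseCmp::GT, true, false};  // b > a
  case Cond::LE: return {BaseCmp::GT, false, true};  // !(a > b)
  case Cond::GE: return {BaseCmp::GT, true, true};   // !(b > a)
  }
  return {BaseCmp::EQ, false, false}; // not reached for valid input
}

// Evaluate a plan on two scalars with a single primitive comparison.
inline bool runPlan(CmpPlan P, int A, int B) {
  if (P.Swap) {
    int T = A;
    A = B;
    B = T;
  }
  bool R = (P.Base == BaseCmp::EQ) ? (A == B) : (A > B);
  return P.Negate ? !R : R;
}
```

Every one of the six conditions is served by exactly one primitive comparison, which is the property the message emphasizes: no case needs two separate compares.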
* [mips] Add assembler support for '.set mipsX'.Daniel Sanders2014-08-043-3/+199
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch also fixes an issue with the way the Mips assembler enables/disables architecture features. Before this patch, the assembler never disabled feature bits. For example, .set mips64 .set mips32r2 would result in the 'OR' of mips64 with mips32r2 feature bits which isn't right. Unfortunately this isn't trivial to fix because there's not an easy way to clear feature bits as the algorithm in MCSubtargetInfo (ToggleFeature) only clears the bits that imply the feature being cleared and not the bits implied by the feature (there's a better explanation in the code I added). Patch by Matheus Almeida and updated by Toma Tabacu Reviewers: vmedic, matheusalmeida, dsanders Reviewed By: dsanders Subscribers: tomatabacu, llvm-commits Differential Revision: http://reviews.llvm.org/D4123 llvm-svn: 214709
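The difference between OR-ing feature bits and actually switching architecture can be illustrated with a toy bitmask. Everything in this sketch is an assumption made for the example: the bit values, the names, and the simplified implication table are hypothetical and do not reflect LLVM's MCSubtargetInfo or the real Mips feature definitions.

```cpp
#include <cstdint>

// Hypothetical feature bits (not LLVM's actual values).
constexpr uint32_t FeatMips32 = 1u << 0;
constexpr uint32_t FeatMips32r2 = 1u << 1;
constexpr uint32_t FeatMips64 = 1u << 2;
constexpr uint32_t AllArchBits = FeatMips32 | FeatMips32r2 | FeatMips64;

// Simplified assumption: mips32r2 and mips64 each imply mips32.
constexpr uint32_t withImplied(uint32_t Feat) {
  return Feat == FeatMips32r2 ? (FeatMips32r2 | FeatMips32)
         : Feat == FeatMips64 ? (FeatMips64 | FeatMips32)
                              : Feat;
}

// Old behaviour as described in the message: bits are only ever added.
constexpr uint32_t setArchByOr(uint32_t Bits, uint32_t Feat) {
  return Bits | withImplied(Feat);
}

// Intended behaviour: drop every architecture bit, then enable the
// requested level together with what it implies.
constexpr uint32_t setArchExclusive(uint32_t Bits, uint32_t Feat) {
  return (Bits & ~AllArchBits) | withImplied(Feat);
}
```

Applying `.set mips64` and then `.set mips32r2` through `setArchByOr` leaves all three bits set, whereas `setArchExclusive` ends with only the mips32r2 and mips32 bits.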
* [x86] Just unilaterally prefer SSSE3-style PSHUFB lowerings over cleverChandler Carruth2014-08-041-35/+35
| | | | | | | | | | | | | use of PACKUS. It's cleaner that way. I looked at implementing clever combine-based folding of PACKUS chains into PSHUFB but it is quite hard and doesn't seem likely to be worth it. The most annoying part would be detecting that the correct masking had been done to use PACKUS-style instructions as a blend operation rather than there being any saturating as is indicated by its name. We generate really nice code for what few test cases I've come up with that aren't completely contrived for this by just directly preferring PSHUFB and so let's go with that strategy for now. =] llvm-svn: 214707
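The remark about masking versus saturation can be made concrete with a scalar model of one PACKUS lane. This is an illustrative sketch, not the lowering code: `packusElem` models the signed-16-bit to unsigned-8-bit saturating conversion that PACKUSWB performs per element, and `truncViaPackus` is a hypothetical helper showing the masked use.

```cpp
#include <cstdint>

// One lane of PACKUSWB: convert a signed 16-bit value to an unsigned 8-bit
// value with unsigned saturation.
inline uint8_t packusElem(int16_t V) {
  if (V < 0)
    return 0;
  if (V > 255)
    return 255;
  return static_cast<uint8_t>(V);
}

// If the high byte has been masked off first, saturation can never trigger
// and the pack behaves like a plain truncation of the low byte.
inline uint8_t truncViaPackus(uint16_t V) {
  return packusElem(static_cast<int16_t>(V & 0x00FFu));
}
```

Without the mask, 0x1234 saturates to 255. With the mask it yields 0x34, which is the truncation that a shuffle-style use relies on, and a combine would have to prove the mask was applied before treating the pack that way.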
* [x86] Implement more aggressive use of PACKUS chains for lowering commonChandler Carruth2014-08-041-0/+106
| | | | | | | | | | | | | | patterns of v16i8 shuffles. This implements one of the more important FIXMEs for the SSE2 support in the new shuffle lowering. We now generate the optimal shuffle sequence for truncate-derived shuffles which show up essentially everywhere. Unfortunately, this exposes a weakness in other parts of the shuffle logic -- we can no longer form PSHUFB here. I'll add the necessary support for that and other things in a subsequent commit. llvm-svn: 214702
* Revert "r214669 - MachineCombiner Pass for selecting faster instruction"Kevin Qin2014-08-045-536/+13
| | | | | | This commit broke "make check" for several hours, so get it reverted. llvm-svn: 214697