path: root/llvm
Commit message | Author | Age | Files | Lines
* [FastISel] Undo phi node updates when falling-back to SelectionDAG. | Juergen Ributzka | 2014-08-28 | 3 | -4/+34
| The included test case would fail, because the MI PHI node would have two operands from the same predecessor.
| This problem occurs when a switch instruction cannot be selected. This always happens, because there is no default switch support for FastISel to begin with.
| The problem was that FastISel would first add the operand to the PHI nodes and then fall back to SelectionDAG, which would then in turn add the same operands to the PHI nodes again.
| This fix removes these duplicate PHI node operands by resetting the PHINodesToUpdate to its original state before FastISel tried to select the instruction.
| This fixes <rdar://problem/18155224>.
| llvm-svn: 216640
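The rollback described in the commit above can be pictured roughly as follows. This is a minimal standalone sketch, not the FastISel code: `PendingPhiUpdates`, `tryFastSelect` and `selectWithRollback` are hypothetical names, and the list of pending updates is assumed to be an append-only vector, so restoring the "original state" amounts to truncating it back to its saved length.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the list of pending PHI operand updates:
// each entry pairs a PHI id with the register that feeds it.
using PendingPhiUpdates = std::vector<std::pair<int, int>>;

// Hypothetical fast path: it records its PHI updates first and only then
// reports whether it could actually select the instruction.
inline bool tryFastSelect(PendingPhiUpdates &updates, bool canSelect) {
  updates.push_back({1, 100});
  updates.push_back({2, 101});
  return canSelect;
}

// Try the fast path; on failure, truncate the list back to its saved
// length so a fallback selector does not add the same operands twice.
inline bool selectWithRollback(PendingPhiUpdates &updates, bool canSelect) {
  const std::size_t savedSize = updates.size();
  if (tryFastSelect(updates, canSelect))
    return true;
  updates.resize(savedSize); // drop entries appended by the failed attempt
  return false;
}
```

Saving only the size is enough under the append-only assumption; if the fast path could also modify existing entries, a full copy of the list would be needed instead.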
* [FastISel] | Juergen Ributzka | 2014-08-28 | 3 | -2/+19
| Currently instructions are folded very aggressively for AArch64 into the memory operation, which can lead to the use of killed operands:
|   %vreg1<def> = ADDXri %vreg0<kill>, 2
|   %vreg2<def> = LDRBBui %vreg0, 2
|   ... = ... %vreg1 ...
| This usually happens when the result is also used by another non-memory instruction in the same basic block, or any instruction in another basic block.
| This fix teaches hasTrivialKill to not only check the LLVM IR that the value has a single use, but also to check if the register that represents that value has already been used. This can happen when the instruction with the use was folded into another instruction (in this particular case a load instruction).
| This fixes rdar://problem/18142857.
| llvm-svn: 216634
* Revert "[FastISel][AArch64] Don't fold instructions too aggressively into the memory operation." | Juergen Ributzka | 2014-08-27 | 2 | -222/+16
| Quentin pointed out that this is not the correct approach and there is a better and easier solution.
| llvm-svn: 216632
* Fix unaligned reads/writes in X86JIT and RuntimeDyldELF. | Alexey Samsonov | 2014-08-27 | 3 | -49/+78
| Summary: Introduce support::ulittleX_t::ref type to Support/Endian.h and use it in x86 JIT to enforce correct endianness and fix unaligned accesses.
| Test Plan: regression test suite
| Reviewers: lhames
| Subscribers: ributzka, llvm-commits
| Differential Revision: http://reviews.llvm.org/D5011
| llvm-svn: 216631
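To illustrate the kind of access the commit above is concerned with, here is a minimal sketch of alignment-safe, endianness-explicit 32-bit reads and writes. It is not the `support::ulittleX_t::ref` implementation; `readLE32` and `writeLE32` are hypothetical helpers that copy through a byte array instead of dereferencing a possibly misaligned `uint32_t *`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Read a 32-bit little-endian value from an arbitrarily aligned address.
inline std::uint32_t readLE32(const void *p) {
  unsigned char b[4];
  std::memcpy(b, p, sizeof(b)); // byte copy: no alignment requirement
  return std::uint32_t(b[0]) | (std::uint32_t(b[1]) << 8) |
         (std::uint32_t(b[2]) << 16) | (std::uint32_t(b[3]) << 24);
}

// Write a 32-bit value in little-endian byte order to any address.
inline void writeLE32(void *p, std::uint32_t v) {
  const unsigned char b[4] = {
      static_cast<unsigned char>(v & 0xFF),
      static_cast<unsigned char>((v >> 8) & 0xFF),
      static_cast<unsigned char>((v >> 16) & 0xFF),
      static_cast<unsigned char>((v >> 24) & 0xFF)};
  std::memcpy(p, b, sizeof(b));
}
```

Because the byte order is spelled out with shifts, the result is the same on big-endian and little-endian hosts.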
* [FastISel][AArch64] Don't fold instructions too aggressively into the memory operation. | Juergen Ributzka | 2014-08-27 | 2 | -16/+222
| Currently instructions are folded very aggressively into the memory operation, which can lead to the use of killed operands:
|   %vreg1<def> = ADDXri %vreg0<kill>, 2
|   %vreg2<def> = LDRBBui %vreg0, 2
|   ... = ... %vreg1 ...
| This usually happens when the result is also used by another non-memory instruction in the same basic block, or any instruction in another basic block.
| If the computed address is used by only memory operations in the same basic block, then it is safe to fold them. This is because all memory operations will fold the address computation and the original computation will never be emitted.
| This fixes rdar://problem/18142857.
| llvm-svn: 216629
* Avoid zero length memset error | Renato Golin | 2014-08-27 | 1 | -2/+4
| Adding a check on buffer length to avoid a __warn_memset_zero_len warning on GCC 4.8.2.
| llvm-svn: 216624
* Use local variable in visitFADD. No functional change. | Sanjay Patel | 2014-08-27 | 1 | -13/+11
| llvm-svn: 216623
* [FastISel][AArch64] Fix a comment in my previous commit (r216617). | Juergen Ributzka | 2014-08-27 | 1 | -1/+1
| llvm-svn: 216622
* [FastISel][AArch64] Fix simplify address when the address comes from a shift. | Juergen Ributzka | 2014-08-27 | 2 | -0/+25
| When the address comes directly from a shift instruction then the address computation cannot be folded into the memory instruction, because the zero register is not available as a base register. Simplify address needs to emit the shift instruction and use the result as base register.
| llvm-svn: 216621
* Fix a double free in llvm::getBitcodeTargetTriple. | Rafael Espindola | 2014-08-27 | 1 | -1/+1
| Unfortunately this is only used by ld64, so no testcase, but should fix the darwin LTO bootstrap.
| llvm-svn: 216618
* [FastISel][AArch64] Use the zero register for stores. | Juergen Ributzka | 2014-08-27 | 3 | -15/+34
| Use the zero register directly when possible to avoid an unnecessary register copy and a wasted register at -O0.
| This also uses integer stores to store a positive floating-point zero. This saves us from materializing the positive zero in a register and then storing it.
| llvm-svn: 216617
* Group unsafe-math optimizations for fsub into one block. No functional change. | Sanjay Patel | 2014-08-27 | 1 | -14/+17
| llvm-svn: 216616
* [FastISel] Fix a potential bug in FastEmitInst_ri | Juergen Ributzka | 2014-08-27 | 1 | -2/+1
| FastEmitInst_ri was constraining the first operand without checking if it is a virtual register. Use constrainOperandRegClass, as all the other FastEmitInst_* functions do.
| llvm-svn: 216613
* Use local variable to improve readability. | Sanjay Patel | 2014-08-27 | 1 | -15/+10
| No functional change intended.
| llvm-svn: 216611
* typo in comment | Sanjay Patel | 2014-08-27 | 1 | -1/+1
| llvm-svn: 216609
* Don't create a MemoryBuffer just to get the MemoryBufferRef. NFC. | Rafael Espindola | 2014-08-27 | 1 | -6/+6
| llvm-svn: 216608
* Convert a few more cases of direct initialization of unique_ptrs from MemoryBuffer::getMemBuffer to move initialization now that it returns by unique_ptr instead of raw pointer. | David Blaikie | 2014-08-27 | 3 | -12/+12
| Cleanup/improvements following r216583.
| llvm-svn: 216605
* X86 MC: Handle instructions like fxsave that match multiple operand sizes | Reid Kleckner | 2014-08-27 | 3 | -10/+44
| Instructions like 'fxsave' and control flow instructions like 'jne' match any operand size. The loop I added to the Intel syntax matcher assumed that using a different size would give a different instruction. Now it handles the case where we get the same instruction for different memory operand sizes.
| This also allows us to remove the hack we had for unsized absolute memory operands, because we can successfully match things like 'jnz' without reporting ambiguity. Removing this hack uncovered a test case involving 'fadd' that was ambiguous. The memory operand could have been single or double precision.
| llvm-svn: 216604
* InstCombine: Combine gep X, (Y-X) to Y | David Majnemer | 2014-08-27 | 2 | -14/+38
| We try to perform this transform in InstSimplify but we aren't always able to. Sometimes, we need to insert a bitcast if X and Y don't have the same type.
| llvm-svn: 216598
* InstSimplify: Don't simplify gep X, (Y-X) to Y if types differ | David Majnemer | 2014-08-27 | 2 | -1/+16
| It's incorrect to perform this simplification if the types differ. A bitcast would need to be inserted for this to work.
| This fixes PR20771.
| llvm-svn: 216597
* Reland r216439 and r216441, majnemer has a real fix for PR20771. | Nico Weber | 2014-08-27 | 3 | -11/+142
| llvm-svn: 216586
* Return a std::unique_ptr when creating a new MemoryBuffer. | Rafael Espindola | 2014-08-27 | 16 | -89/+81
| llvm-svn: 216583
* Revert r216439 (and r216441, else the former doesn't revert cleanly). | Nico Weber | 2014-08-27 | 3 | -142/+11
| It caused PR 20771. I'll land a test on the clang side.
| llvm-svn: 216582
* Remove unused argument. | Rafael Espindola | 2014-08-27 | 1 | -9/+6
| llvm-svn: 216580
* Use BitVector instead of int in R600 SIISelLowering. | Alexey Samsonov | 2014-08-27 | 1 | -3/+4
| int may not have enough bits in it, which was detected by UBSan bootstrap (it reported left shift by a too large constant).
| llvm-svn: 216579
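The hazard named in the commit above is that `1 << idx` on an `int` is undefined once `idx` reaches the bit width of `int`. A minimal sketch of the alternative, assuming a dynamically sized bit set is acceptable, is shown below; `IndexSet` is a hypothetical stand-in built on `std::vector<bool>`, not LLVM's `BitVector`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical dynamically sized bit set. Unlike an int used as a mask,
// its capacity is chosen at construction and is not tied to the width
// of any integer type.
class IndexSet {
  std::vector<bool> bits;

public:
  explicit IndexSet(std::size_t size) : bits(size, false) {}

  // at() throws std::out_of_range instead of silently overflowing.
  void set(std::size_t i) { bits.at(i) = true; }
  bool test(std::size_t i) const { return bits.at(i); }
};
```

With this shape, marking index 40 works the same way as marking index 4, where a 32-bit `int` mask would have needed a shift by more than its width.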
* yaml::Stream doesn't need to take ownership of the buffer. | Rafael Espindola | 2014-08-27 | 3 | -28/+29
| In fact, most users were already using the StringRef version.
| llvm-svn: 216575
* Fix some semantic usability issues with DynamicLibrary. | Zachary Turner | 2014-08-27 | 1 | -2/+3
| This patch allows invalid DynamicLibrary instances to be constructed, and fixes the const-correctness of the isValid() method.
| No functional change.
| llvm-svn: 216571
* InstSimplify: Compute comparison ranges for left shift instructions | David Majnemer | 2014-08-27 | 2 | -0/+43
| 'shl nuw CI, x' produces [CI, CI << CLZ(CI)]
| 'shl nsw CI, x' produces [CI << CLO(CI)-1, CI] if CI is negative
| 'shl nsw CI, x' produces [CI, CI << CLZ(CI)-1] if CI is non-negative
| llvm-svn: 216570
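As a worked instance of the first range above, the sketch below computes [CI, CI << CLZ(CI)] for a 32-bit unsigned constant. It is an illustration under assumptions, not the InstSimplify code: the 32-bit width, the helper names `countLeadingZeros32` and `shlNuwRange`, and the treatment of CI == 0 as the degenerate range [0, 0] are all choices made here.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Count leading zero bits of a 32-bit value (returns 32 for zero).
inline unsigned countLeadingZeros32(std::uint32_t v) {
  unsigned n = 0;
  for (std::uint32_t mask = 0x80000000u; mask != 0 && (v & mask) == 0;
       mask >>= 1)
    ++n;
  return n;
}

// Inclusive range of values 'shl nuw CI, x' can take without unsigned
// wrap: the smallest is CI itself (x == 0); the largest shifts the top
// set bit of CI into the most significant position.
inline std::pair<std::uint32_t, std::uint32_t>
shlNuwRange(std::uint32_t ci) {
  if (ci == 0)
    return std::make_pair(0u, 0u); // avoid an out-of-range shift by 32
  return std::make_pair(ci, ci << countLeadingZeros32(ci));
}
```

For CI = 5 (binary 101) there are 29 leading zeros, so the upper bound is 5 << 29 = 0xA0000000.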
* Revert "Limit the symbol search in DynamicLibrary to the module that was opened." | Zachary Turner | 2014-08-27 | 1 | -9/+2
| This reverts commit r216563, which breaks lli's dynamic symbol resolution.
| llvm-svn: 216569
* [MCJIT] Replace a C-style cast in RuntimeDyldImpl.h. | Lang Hames | 2014-08-27 | 1 | -1/+1
| llvm-svn: 216568
* [MCJIT] More endianness fixes for RuntimeDyldMachO. | Lang Hames | 2014-08-27 | 2 | -12/+28
| http://llvm.org/PR20640
| llvm-svn: 216567
* Limit the symbol search in DynamicLibrary to the module that was opened. | Zachary Turner | 2014-08-27 | 1 | -2/+9
| Differential Revision: http://reviews.llvm.org/D5030
| Reviewed By: Reid Kleckner, Rafael Espindola
| llvm-svn: 216563
* Teach the AArch64 backend about v4f16 and v8f16 | Oliver Stannard | 2014-08-27 | 15 | -132/+1893
| This teaches the AArch64 backend to deal with the operations on v4f16 and v8f16 which are exposed by NEON intrinsics, plus the add, sub, mul and div operations.
| llvm-svn: 216555
* [SLP] Re-enable vectorization of GEP expressions (re-apply r210342 with a fix). | Michael Zolotukhin | 2014-08-27 | 2 | -0/+142
| llvm-svn: 216549
* Clang-format over X86AsmInstrumentation.* with LLVM style. | Evgeniy Stepanov | 2014-08-27 | 2 | -129/+132
| r216536 mistakenly used -style=Google instead of LLVM.
| llvm-svn: 216543
* Add an explicit cast to pacify implicit boolean conversion warnings. | Benjamin Kramer | 2014-08-27 | 1 | -1/+1
| llvm-svn: 216539
* [x86] Fix a regression introduced with r213897 for 32-bit targets where | Chandler Carruth | 2014-08-27 | 2 | -6/+15
| we stopped efficiently lowering sextload using the SSE41 instructions for that operation.
| This is a consequence of a bad predicate I used thinking of the memory access needs. The code actually handles the cases where the predicate doesn't apply, and handles them much better. =] Simple fix and a test case added.
| Fixes PR20767.
| llvm-svn: 216538
* [SDAG] Re-instate r215611 with a fix to a pesky X86 DAG combine. | Chandler Carruth | 2014-08-27 | 4 | -17/+32
| This combine is essentially combining target-specific nodes back into target independent nodes that it "knows" will be combined yet again by a target independent DAG combine into a different set of target-independent nodes that are legal (not custom though!) and thus "ok". This seems... deeply flawed. The crux of the problem is that we don't combine un-legalized shuffles that are introduced by legalizing other operations, and thus we don't see a very profitable combine opportunity. So the backend just forces the input to that combine to re-appear.
| However, for this to work, the conditions detected to re-form the unlegalized nodes must be *exactly* right. Previously, failing this would have caused poor code (if you're lucky) or a crasher when we failed to select instructions. After r215611 we would fall back into the legalizer. In some cases, this just "fixed" the crasher by producing bad code. But in the test case added it caused the legalizer and the dag combiner to iterate forever.
| The fix is to make the alignment checking in the x86 side of things match the alignment checking in the generic DAG combine exactly. This isn't really a satisfying or principled fix, but it at least makes the code work as intended. It also highlights that it would be nice to detect the availability of underaligned loads for a given type rather than bailing on this optimization. I've left a FIXME to document this.
| Original commit message for r215611 which covers the rest of the change:
| [SDAG] Fix a case where we would iteratively legalize a node during combining by replacing it with something else but not re-process the node afterward to remove it.
| In a truly remarkable stroke of bad luck, this would (in the test case attached) end up getting some other node combined into it without ever getting re-processed. By adding it back on to the worklist, in addition to deleting the dead nodes more quickly we also ensure that if it *stops* being dead for any reason it makes it back through the legalizer. Without this, the test case will end up failing during instruction selection due to an and node with a type we don't have an instruction pattern for.
| It took many million runs of the shuffle fuzz tester to find this.
| llvm-svn: 216537
* Clang-format over X86AsmInstrumentation.*. | Evgeniy Stepanov | 2014-08-27 | 2 | -183/+216
| llvm-svn: 216536
* [SKX] Added new versions of cmp instructions in avx512_icmp_cc multiclass, added VL multiclass. | Robert Khasanov | 2014-08-27 | 5 | -34/+1252
| Added encoding tests
| llvm-svn: 216532
* AVX-512: Added intrinsic for VMOVSS store form with mask. | Elena Demikhovsky | 2014-08-27 | 2 | -0/+19
| llvm-svn: 216530
* Simplify creation of a bunch of ArrayRefs by using None, makeArrayRef or just letting them be implicitly created. | Craig Topper | 2014-08-27 | 40 | -131/+102
| llvm-svn: 216525
* Fix some cases where ArrayRefs were being passed by reference. Also remove 'const' from some other ArrayRef uses since it's implicitly const already. | Craig Topper | 2014-08-27 | 5 | -8/+8
| llvm-svn: 216524
* InstCombine: Optimize GEP's involving ptrtoint better | David Majnemer | 2014-08-27 | 2 | -11/+74
| We supported transforming:
|   (gep i8* X, -(ptrtoint Y))
| to:
|   (inttoptr (sub (ptrtoint X), (ptrtoint Y)))
| However, this only fired if 'X' had type i8*. Generalize this to support various types of different sizes.
| This results in much better CodeGen, especially for pointers to packed structs.
| llvm-svn: 216523
* Remove type unit skeletons. GDB no longer needs them & this saves a heap of space. | David Blaikie | 2014-08-27 | 2 | -40/+7
| llvm-svn: 216521
* [FastISel][AArch64] Fix address simplification. | Juergen Ributzka | 2014-08-27 | 2 | -8/+130
| When a shift with extension or an add with shift and extension cannot be folded into the memory operation, then the address calculation has to be materialized separately. While doing so the code forgot to consider a possible sign-/zero-extension. This fix now also folds the sign-/zero-extension into the add or shift instruction which is used to materialize the address.
| This fixes rdar://problem/18141718.
| llvm-svn: 216511
* [FastISel][AArch64] Fold Sign-/Zero-Extend into the shift immediate instruction. | Juergen Ributzka | 2014-08-27 | 2 | -55/+419
| llvm-svn: 216510
* Fix a couple of debug info test cases to match the metadata schema change in r216239 | David Blaikie | 2014-08-27 | 2 | -2/+2
| Found these while testing something else.
| llvm-svn: 216505
* Pass a std::unique_ptr<MemoryBuffer>& to getLazyBitcodeModule. | Rafael Espindola | 2014-08-26 | 8 | -22/+22
| By taking a reference we can do the ownership transfer in one place instead of expecting every caller to do it.
| llvm-svn: 216492
* Pass a MemoryBufferRef when we can avoid taking ownership. | Rafael Espindola | 2014-08-26 | 16 | -71/+58
| The attached patch simplifies a few interfaces that don't need to take ownership of a buffer.
| For example, both parseAssembly and parseBitcodeFile will parse the entire buffer before returning. There is no need to take ownership.
| Using a MemoryBufferRef makes it obvious in the type signature that there is no ownership transfer.
| llvm-svn: 216488
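The ownership point made in the commit above can be sketched in standard C++ without any LLVM types. The names below (`BufferRef`, `refOf`, `countLines`) are hypothetical and only mimic the idea: a function that finishes with the data before it returns can take a small non-owning view, so its signature shows that the caller keeps ownership.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical non-owning view of a byte range; not LLVM's MemoryBufferRef.
struct BufferRef {
  const char *data;
  std::size_t size;
};

// Build a view over storage that the caller continues to own.
inline BufferRef refOf(const std::string &storage) {
  return BufferRef{storage.data(), storage.size()};
}

// Reads the whole buffer before returning, so borrowing is sufficient;
// nothing in the signature suggests the buffer is consumed.
inline std::size_t countLines(BufferRef buf) {
  std::size_t lines = 0;
  for (std::size_t i = 0; i < buf.size; ++i)
    if (buf.data[i] == '\n')
      ++lines;
  return lines;
}
```

A caller holding a `std::unique_ptr<std::string>` can pass `refOf(*owner)` and still own the string afterwards, whereas a parameter of type `std::unique_ptr` would force a `std::move` at every call site.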