path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* [x86] Teach the target combine step to aggressively fold pshufd instructions.Chandler Carruth2014-06-271-11/+77
| | | | | | | | | | | | | Summary: This allows it to fold pshufd instructions across intervening half-shuffles and other noise. This pattern actually shows up in the generic lowering tests, but I've also added direct tests using intrinsics to make sure that the specific desired functionality is working even if the lowering stuff changes in the future. Differential Revision: http://reviews.llvm.org/D4292 llvm-svn: 211892
* [x86] Teach the target-specific combining how to aggressively foldChandler Carruth2014-06-271-0/+90
| | | | | | | | | | | | | | | | | | half-shuffles, even looking through intervening instructions in a chain. Summary: This doesn't happen to show up with any test cases I've found for the current shuffle lowering, but previous attempts would benefit from this and it seems generally useful. I've tested it directly using intrinsics, which also shows that it will work with hand vectorized code as well. Note that even though pshufd isn't directly used in these tests, it gets exercised because we combine some of the half shuffles into a pshufd first, and then merge them. Differential Revision: http://reviews.llvm.org/D4291 llvm-svn: 211890
* [x86] Teach the X86 backend to DAG-combine SSE2 shuffles that areChandler Carruth2014-06-271-1/+101
| | | | | | | | | | | | | | | | | trivially redundant. This fixes several cases in the new vector shuffle lowering algorithm which would generate redundant shuffle instructions for the sake of simplicity. I'm also deleting a testcase which was somewhat ridiculous. It was checking for a bug in 2007 about incorrectly transforming shuffles by looking for the string "-86" in the output of a pretty substantial function. This test case doesn't seem to have any value at this point. Differential Revision: http://reviews.llvm.org/D4240 llvm-svn: 211889
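A minimal sketch of why some back-to-back shuffles are trivially redundant, assuming 4-lane shuffles controlled by a pshufd-style 8-bit immediate (result lane i reads source lane `(imm >> 2*i) & 3`). The helper names `decodeImm`, `composeMasks` and `isIdentity` are hypothetical and are not taken from the LLVM sources; this only illustrates the mask arithmetic, not the actual DAG combine.

```cpp
#include <array>
#include <cstdint>

// One entry per result lane: the index of the source lane it reads.
using Mask4 = std::array<int, 4>;

// Decode a pshufd-style immediate into a per-lane mask.
// Hypothetical helper for illustration only.
inline Mask4 decodeImm(std::uint8_t imm) {
  Mask4 m{};
  for (int i = 0; i < 4; ++i)
    m[i] = (imm >> (2 * i)) & 3;
  return m;
}

// Applying `first` and then `second` is equivalent to a single shuffle:
// lane i of the final result reads lane second[i] of the intermediate
// value, which in turn reads lane first[second[i]] of the original input.
inline Mask4 composeMasks(const Mask4 &first, const Mask4 &second) {
  Mask4 out{};
  for (int i = 0; i < 4; ++i)
    out[i] = first[second[i]];
  return out;
}

// A mask that maps every lane to itself means the shuffle is a no-op.
inline bool isIdentity(const Mask4 &m) {
  for (int i = 0; i < 4; ++i)
    if (m[i] != i)
      return false;
  return true;
}
```

For example, the immediate 0x1B reverses the four lanes, so two such shuffles in a row compose to the identity mask and both could be dropped; two shuffles whose composition is not the identity could still be merged into one.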
* [x86] Begin a significant overhaul of how vector lowering is done in theChandler Carruth2014-06-271-0/+1029
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | x86 backend. This sketches out a new code path for vector lowering, hidden behind an off-by-default flag while it is under development. The fundamental idea behind the new code path is to aggressively break down the problem space in ways that ease selecting the odd set of instructions available on x86, and carefully avoid scalarizing code even when forced to use older ISAs. Notably, this starts off restricting itself to SSE2 and implements the complete vector shuffle and blend space for 128-bit vectors in SSE2 without scalarizing. The plan is to layer on top of this ISA extensions where we can bail out of the complex SSE2 lowering and opt for a cheaper, specialized instruction (or set of instructions). It also needs to be generalized to AVX and AVX512 vector widths. Currently, this does a decent but not perfect job for SSE2. There are some specific shortcomings that I plan to address: - We need a peephole combine to fold together shuffles where possible. There are cases where a previous shuffle could be modified slightly to arrange for elements to be in the correct position and a later shuffle eliminated. Doing this eagerly added quite a bit of complexity, and so my plan is to combine away these redundancies afterward. - There are a lot more clever ways to use unpck and pack that need to be added. This is essential for real world shuffles as it turns out... Once SSE2 is polished a bit I should be able to get interesting numbers on performance improvements on benchmarks conducive to vectorization. All of this will be off by default until it is functionally equivalent of course. Differential Revision: http://reviews.llvm.org/D4225 llvm-svn: 211888
* [RuntimeDyld, PowerPC] Fix/improve handling of TOC relocationsUlrich Weigand2014-06-272-56/+72
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current PPC64 RuntimeDyld code to handle TOC relocations has two problems: - With recent linkers, in addition to the relocations that implicitly refer to the TOC base (R_PPC64_TOC*), you can now also use the .TOC. magic symbol with any other relocation to refer to the TOC base explicitly. This isn't currently used much in ELFv1 code (although it could be), but it is essential in ELFv2 code. - In a complex JIT environment with multiple modules, each module may have its own .toc section, and TOC relocations in one module must refer to *its own* TOC section. The current findPPC64TOC implementation does not correctly implement this; in fact, it will always return the address of the first TOC section it finds anywhere. (Note that at the time findPPC64TOC is called, we don't even *know* which module the relocation originally resided in, so it is not even possible to fix this routine as-is.) This commit fixes both problems by handling TOC relocations earlier, in processRelocationRef. To do this, I've removed the findPPC64TOC routine and replaced it by a new routine findPPC64TOCSection, which works analogously to findOPDEntrySection in scanning the sections of the ObjImage provided by its caller, processRelocationRef. This solves the issue of finding the correct TOC section associated with the current module. This makes it straightforward to implement both R_PPC64_TOC relocations, and relocations explicitly referring to the .TOC. symbol, directly in processRelocationRef. There is now a new problem in implementing the R_PPC64_TOC16* relocations, because those can now in theory involve *three* different sections: the relocation may be applied in section A, refer explicitly to a symbol in section B, and refer implicitly to the TOC section C. The final processing of the relocation thus may only happen after all three of these sections have been assigned final addresses. 
There is currently no obvious means to implement this in its general form with the common-code RuntimeDyld infrastructure. Fortunately, ppc64 code usually makes no use of this most general form; in fact, TOC16 relocations are only ever generated by LLVM for symbols residing themselves in the TOC, which means "section B" == "section C" in the above terminology. This special case can easily be handled with the current infrastructure, and that is what this patch does. [ Unhandled cases result in an explicit error, unlike the current code which silently returns the wrong TOC base address ... ] This patch makes the JIT work on both BE and LE (ELFv2 requires additional patches, of course), and allowed me to successfully run complex JIT scenarios (via mesa/llvmpipe). Reviewed by Hal Finkel. llvm-svn: 211885
* IRReader: don't mark MemoryBuffers constAlp Toker2014-06-272-4/+3
| | | | llvm-svn: 211883
* Added instruction combine to transform few more negative values addition to ↵Dinesh Dwivedi2014-06-271-48/+62
| | | | | | | | | | | | subtraction (Part 3) This patch enables transforms for (x + (~(y | c) + 1)) --> x - (y | c) if c is odd Differential Revision: http://reviews.llvm.org/D4210 llvm-svn: 211881
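The fold above relies on the two's complement identity ~v + 1 == -v. A small sketch that checks both forms agree on 32-bit unsigned values follows; the function names `addNegatedForm` and `subForm` are hypothetical, and the sketch does not model how InstCombine matches the pattern or why the commit restricts it to odd c.

```cpp
#include <cstdint>

// Left-hand side of the fold: x + (~(y | c) + 1).
// Unsigned arithmetic wraps modulo 2^32, so overflow is well defined.
inline std::uint32_t addNegatedForm(std::uint32_t x, std::uint32_t y,
                                    std::uint32_t c) {
  return x + (~(y | c) + 1u);
}

// Right-hand side of the fold: x - (y | c).
inline std::uint32_t subForm(std::uint32_t x, std::uint32_t y,
                             std::uint32_t c) {
  return x - (y | c);
}
```

With x = 10, y = 4, c = 3, the value y | c is 7 and both forms evaluate to 3.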
* Remove the caching of the target machine from SystemZTargetLowering.Eric Christopher2014-06-272-21/+33
| | | | | | Update all callers and uses accordingly. llvm-svn: 211880
* Remove target machine caching from SystemZInstrInfo andEric Christopher2014-06-275-20/+18
| | | | | | | | SystemZRegisterInfo and replace it with the subtarget as that's all they needed in the first place. Update all uses and calls accordingly. llvm-svn: 211877
* Revert "Revert "Revert "PR20038: DebugInfo: Inlined call sites where the ↵David Blaikie2014-06-273-34/+4
| | | | | | | | | | | caller has debug info but the call itself has no debug location.""" Reverting this again, didn't mean to commit it - while r211872 fixes one of the issues here, there are still others to figure out and address. This reverts commit r211871. llvm-svn: 211873
* ArgumentPromotion: Propagate debug locations on calls for which arguments ↵David Blaikie2014-06-271-0/+1
| | | | | | are promoted. llvm-svn: 211872
* Revert "Revert "PR20038: DebugInfo: Inlined call sites where the caller has ↵David Blaikie2014-06-273-4/+34
| | | | | | | | debug info but the call itself has no debug location."" This reverts commit r211724. llvm-svn: 211871
* Have SystemZSelectionDAGInfo constructor take a DataLayout ratherEric Christopher2014-06-273-6/+4
| | | | | | | than a target machine since it doesn't need anything past the DataLayout. llvm-svn: 211870
* Rename getX86ConditonCode -> getX86ConditionCodeCraig Topper2014-06-271-5/+5
| | | | llvm-svn: 211869
* Left out the NDEBUG in the previous checkin.Andrew Trick2014-06-271-0/+2
| | | | llvm-svn: 211867
* MachineScheduler: add some book-keeping to fix an assert.Andrew Trick2014-06-271-1/+7
| | | | | | Fix for Bug 20057 - Assertion failed in llvm::SUnit* llvm::SchedBoundary::pickOnlyChoice(): Assertion `i <= (HazardRec->getMaxLookAhead() + MaxObservedStall) && "permanent hazard"' Thanks to Chad for the test case. llvm-svn: 211865
* Propagate const-correctness into parseBitcodeFile()Alp Toker2014-06-272-4/+4
| | | | llvm-svn: 211864
* Have MipsSelectionDAGInfo constructor take a DataLayout ratherEric Christopher2014-06-273-4/+4
| | | | | | | than a target machine since it doesn't need anything past the DataLayout. llvm-svn: 211863
* ParseIR: don't take ownership of the MemoryBufferAlp Toker2014-06-272-9/+11
| | | | | | | | | | clang was needlessly duplicating whole memory buffer contents in an attempt to satisfy unclear ownership semantics. Let's just hide internal LLVM quirks and present a simple non-owning interface. The public C API preserves previous behaviour for stability. llvm-svn: 211861
* Move NVPTX subtarget dependent variables from the target machineEric Christopher2014-06-275-49/+70
| | | | | | to the subtarget. llvm-svn: 211860
* Use the target lowering we can get off of the DAG rather than offEric Christopher2014-06-271-1/+1
| | | | | | of the cached target machine. llvm-svn: 211858
* Fix missing newline and simplify debug printing.Matt Arsenault2014-06-271-5/+5
| | | | llvm-svn: 211850
* R600: Move load/store ReplaceNodeResults to common code.Matt Arsenault2014-06-272-14/+14
| | | | | | Future patches will want to custom lower loads on SI. llvm-svn: 211848
* Move the constructor for NVPTXFrameLowering into the implementationEric Christopher2014-06-272-5/+6
| | | | | | file in preparation for the subtarget move. llvm-svn: 211847
* Remove unnecessary caching of the TargetMachine on NVPTXFrameLowering.Eric Christopher2014-06-273-14/+17
| | | | | | Adjust the constructor accordingly. llvm-svn: 211846
* Rework the logic for setting the TargetName. This appears toEric Christopher2014-06-271-11/+5
| | | | | | be shorter and identical in goal. llvm-svn: 211845
* Remove caching of the target machine in NVPTXInstrInfo andEric Christopher2014-06-273-5/+4
| | | | | | update constructor accordingly. llvm-svn: 211840
* Remove comment that duplicated information in the constructorEric Christopher2014-06-271-6/+6
| | | | | | that it's after. llvm-svn: 211839
* Remove commented out code.Eric Christopher2014-06-271-8/+0
| | | | llvm-svn: 211838
* Remove extraneous parens and extraneous const cast (and fix theEric Christopher2014-06-271-3/+3
| | | | | | prototype for the function to match what we were returning). llvm-svn: 211837
* Move the subtarget dependent features from the target machine toEric Christopher2014-06-274-41/+53
| | | | | | the subtarget for the MSP430 target. llvm-svn: 211836
* Remove uses and caches of the target machine and subtarget fromEric Christopher2014-06-275-19/+8
| | | | | | | | both MSP430InstrInfo and MSP430RegisterInfo. Remove unused member variable StackAlign from MSP430RegisterInfo. Update constructors accordingly. llvm-svn: 211835
* Remove caching of an unused subtarget from MSP430FrameLowering.Eric Christopher2014-06-272-8/+3
| | | | llvm-svn: 211830
* [X86] AVX512: Add vbroadcasti*Adam Nemet2014-06-271-0/+22
| | | | | | | | | For now I used a separate template for these sub-vector/tuple broadcasts rather than sharing the mem variants with avx512_int_broadcast_rm. <rdar://problem/17402869> llvm-svn: 211828
* Remove unnecessary caching of variables by MSP430TargetLowering andEric Christopher2014-06-272-14/+5
| | | | | | | make the constructor more general since it only needs a target machine. llvm-svn: 211827
* Have MSP430SelectionDAGInfo constructor take a DataLayout ratherEric Christopher2014-06-273-4/+4
| | | | | | | than a target machine since it doesn't need anything past the DataLayout. llvm-svn: 211826
* Move all of the hexagon subtarget dependent variables from the targetEric Christopher2014-06-274-29/+46
| | | | | | machine to the subtarget. llvm-svn: 211824
* Have HexagonSelectionDAGInfo take a DataLayout rather than aEric Christopher2014-06-273-6/+4
| | | | | | target machine since that's all it needs. llvm-svn: 211822
* Make HexagonISelLowering not dependent upon a HexagonTargetMachine,Eric Christopher2014-06-272-21/+25
| | | | | | but a normal TargetMachine and remove a few cached uses. llvm-svn: 211821
* Reduce indentation.Eric Christopher2014-06-271-362/+360
| | | | llvm-svn: 211820
* Remove unnecessary caching of the subtarget for HexagonFrameLowering and ↵Eric Christopher2014-06-273-8/+4
| | | | | | remove the unused constructor argument. llvm-svn: 211819
* InstrItineraryData is already on the subtarget, no reason toEric Christopher2014-06-272-10/+6
| | | | | | cache it on the target as well. llvm-svn: 211818
* [StackMaps] Enable patchpoint liveness analysis per default.Juergen Ributzka2014-06-262-10/+8
| | | | llvm-svn: 211817
* [Stackmaps] Remove the liveness calculation for stackmap intrinsics.Juergen Ributzka2014-06-262-11/+5
| | | | | | | | | | There is no need to calculate the liveness information for stackmaps. The liveness information is still available for the patchpoint intrinsic and that is also the intended usage model. Related to <rdar://problem/17473725> llvm-svn: 211816
* [RuntimeDyld] Teach MachOObjectImage to deregister itself with the debugger uponLang Hames2014-06-261-1/+4
| | | | | | destruction the same way ELFObjectImage does. llvm-svn: 211815
* Revert "Introduce a string_ostream string builder facilty"Alp Toker2014-06-2643-159/+225
| | | | | | Temporarily back out commits r211749, r211752 and r211754. llvm-svn: 211814
* Move the various Subtarget dependent members down to the subtargetEric Christopher2014-06-264-65/+81
| | | | | | | | for the Sparc port. Use the same initializeSubtargetDependencies function to handle initialization similar to the other ports to handle dependencies. llvm-svn: 211811
* Have SparcSelectionDAGInfo take a DataLayout to initialize sinceEric Christopher2014-06-263-5/+5
| | | | | | that's all it needs. llvm-svn: 211810
* Remove the storage and use of the subtarget out of the sparc frameEric Christopher2014-06-262-9/+11
| | | | | | lowering code. llvm-svn: 211809
* GVN: Preserve invariant.load metadataArnold Schwaighofer2014-06-261-0/+4
| | | | | | | | | | | If both instructions to be replaced are marked invariant the resulting instruction is invariant. rdar://13358910 Fix by Erik Eckstein! llvm-svn: 211801