summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [SCCP] Remove code in visitBinaryOperator (and add tests).Davide Italiano2016-11-221-11/+3
| | | | | | | | | | We visit and/or, we try to derive a lattice value for the instruction even if one of the operands is overdefined. If the non-overdefined value is still 'unknown' just return and wait for ResolvedUndefsIn to "plug in" the correct value. This simplifies the logic a bit. While I'm here add tests for missing cases. llvm-svn: 287709
* TargetSubtargetInfo: Move implementation to lib/CodeGen; NFCMatthias Braun2016-11-223-2/+2
| | | | | | | | | | | | TargetSubtargetInfo is filled with CodeGen specific interfaces nowadays (getInstrInfo(), getFrameLowering(), getSelectionDAGInfo()) most of the tuning flags like enablePostRAScheduler(), getAntiDepBreakMode(), enableRALocalReassignment(), ... also do not seem to be universal enough to make sense outside of CodeGen. Differential Revision: https://reviews.llvm.org/D26948 llvm-svn: 287708
* [InstCombine] change bitwise logic type to eliminate bitcastsSanjay Patel2016-11-221-0/+43
| | | | | | | | | | | | | | | | | | | | In PR27925: https://llvm.org/bugs/show_bug.cgi?id=27925 ...we proposed adding this fold to eliminate a bitcast. In D20774, there was some concern about changing the type of a bitwise op as well as creating bitcasts that might not be free for a target. However, if we're strictly eliminating an instruction (by limiting this to one-use ops), then we should be able to do this in InstCombine. But we're cautiously restricting the transform for now to vector types to avoid possible backend problems. A transform to make sure the logic op is legal for the target should be added to reverse this transform and improve codegen. Differential Revision: https://reviews.llvm.org/D26641 llvm-svn: 287707
* [LCG] Add a previously missing assert about the relationship of RefSCCs.Chandler Carruth2016-11-221-0/+7
| | | | | | No intended change, everything seems to be in working order already. llvm-svn: 287705
* Fixed the lost FastMathFlags in GVN(Global Value Numbering).Vyacheslav Klochkov2016-11-221-1/+6
| | | | | | | Reviewer: Hal Finkel. Differential Revision: https://reviews.llvm.org/D26952 llvm-svn: 287700
* Remove PDBFileBuilder::build() and related functions.Rui Ueyama2016-11-224-103/+0
| | | | | | | | | | | | | | | | | | | PDBFileBuilder supports two different ways to create files. One is PDBFileBuilder::commit. That function takes a filename and write a result to the file. The other is PDBFileBuilder::build. That returns a new PDBFile object. This patch removes the latter because no one is using it and in a real life situation we are very unlikely to need it. Even if you need it, it'd be easy to write a new PDB to a memory buffer and read it back. Removing PDBFileBuilder::build enables us to remove other classes build transitively. Differential Revision: https://reviews.llvm.org/D26987 llvm-svn: 287697
* Fixed the lost FastMathFlags in Reassociate optimization.Vyacheslav Klochkov2016-11-221-0/+6
| | | | | | | Reviewer: Hal Finkel. Differential Revision: https://reviews.llvm.org/D26957 llvm-svn: 287695
* Restructure DwarfDebug::beginInstruction(). [NFC]Paul Robinson2016-11-221-21/+26
| | | | | | | | Will help a pending patch. Differential Revision: http://reviews.llvm.org/D26982 llvm-svn: 287686
* [Triple] Add Facebook vendorShoaib Meenai2016-11-221-0/+2
| | | | | | | | | Add a compiler vendor for Facebook, to enable future vendor-specific behavior. Differential Revision: https://reviews.llvm.org/D25136 llvm-svn: 287684
* [LCG] Add utilities to compute parent and ascestor relationships betweenChandler Carruth2016-11-221-0/+58
| | | | | | | | | | | | | | | | | SCCs. These will be fairly expensive routines to call and might be abused in real code, but are quite useful when debugging or in asserts and are reasonable and well formed properties to query. I've used one of them in an assert that was requested in a code review here. In subsequent commits I'll start using these routines more heavily, for example in unittests etc. But this at least gets the groundwork in place. Differential Revision: https://reviews.llvm.org/D25506 llvm-svn: 287682
* [mips] seb, seh instruction aliasesSimon Dardis2016-11-223-0/+12
| | | | | | | | | | Add the single operand form. Reviewers: vkalintiris Differential Revision: https://reviews.llvm.org/D26961 llvm-svn: 287681
* [PowerPC] Emit VMX loads/stores for aligned ops to avoid adding swaps on LENemanja Ivanovic2016-11-222-6/+23
| | | | | | | | | | | This patch corresponds to review: https://reviews.llvm.org/D26861 It also fixes PR30730. Committing on behalf of Lei Huang. llvm-svn: 287679
* [X86][SSE] Combine UNPCKL(FHADD,FHADD) -> FHADD for v2f64 shuffles.Simon Pilgrim2016-11-221-3/+12
| | | | | | | | This occurs during UINT_TO_FP v2f64 lowering. We can easily generalize this to other horizontal ops (FHSUB, PACKSS, PACKUS) as required - we are doing something similar with PACKUS in lowerV2I64VectorShuffle llvm-svn: 287676
* [mips] Add support for unaligned load/store macros.Vasileios Kalintiris2016-11-222-84/+111
| | | | | | | | Add missing unaligned store macros (ush/usw) and fix the exisiting implementation of the unaligned load macros in order to generate identical expansions with the GNU assembler. llvm-svn: 287646
* CodeGen: simplify TargetMachine::getSymbol interface. NFC.Tim Northover2016-11-228-24/+23
| | | | | | | | | No-one actually had a mangler handy when calling this function, and getSymbol itself went most of the way towards getting its own mangler (with a local TLOF variable) so forcing all callers to supply one was just extra complication. llvm-svn: 287645
* [X86] Change lowerBuildVectorToBitOp() to take a BuildVectorSDNode. NFC.Zvi Rackover2016-11-221-5/+6
| | | | llvm-svn: 287644
* [X86] Remove dead code from LowerVectorBroadcastZvi Rackover2016-11-221-73/+18
| | | | | | | | | | | | Summary: Splat vectors are canonicalized to BUILD_VECTOR's so the code can be simplified. NFC-ish. Reviewers: craig.topper, delena, RKSimon, andreadb Subscribers: RKSimon, llvm-commits Differential Revision: https://reviews.llvm.org/D26678 llvm-svn: 287643
* [AArch64] Set the max interleave factor for Falkor.Chad Rosier2016-11-221-1/+3
| | | | llvm-svn: 287642
* [AArch64] Maximize 80-column. NFC.Chad Rosier2016-11-221-8/+5
| | | | llvm-svn: 287640
* [SelectionDAG] ComputeNumSignBits of TRUNCATE operationsSimon Pilgrim2016-11-221-3/+7
| | | | | | | | | | Add basic ComputeNumSignBits support for TRUNCATE ops for cases where the source's number of sign bits overlaps with the truncated size. Improves X86 SIGN_EXTEND_IN_REG vector cases which were needlessly sign extending boolean vector results. Differential Revision: https://reviews.llvm.org/D26851 llvm-svn: 287635
* [AVX512][inline-asm] Fix AVX512 inline assembly instruction resolution when ↵Coby Tayree2016-11-221-3/+58
| | | | | | | | | | | | | | | | | | | | | | | | the size qualifier of a memory operand is not specified explicitly. This commit handles cases where the size qualifier of an indirect memory reference operand in Intel syntax is missing (e.g. "vaddps xmm1, xmm2, [a]"). GCC will deduce the size qualifier for AVX512 vector and broadcast memory operands based on the possible matches: "vaddps xmm1, xmm2, [a]" matches only “XMMWORD PTR” qualifier. "vaddps xmm1, xmm2, [a]{1to4}" matches only “DWORD PTR” qualifier. This is different from the current behavior of LLVM, which deduces the size qualifier based on the size of the memory operand. For "vaddps xmm1, xmm2, [a]" "char a;" will imply "BYTE PTR" qualifier "short a;" will imply "WORD PTR" qualifier. This commit aligns LLVM to GCC’s behavior. This is the LLVM part of the review. The Clang part of the review: https://reviews.llvm.org/D26587 Differential Revision: https://reviews.llvm.org/D26586 llvm-svn: 287630
* Rename option to -lto-pass-remarks-outputAdam Nemet2016-11-221-1/+1
| | | | | | | The new option -pass-remarks-output broke LLVM_LINK_LLVM_DYLIB because of the duplicate option name with opt. llvm-svn: 287627
* [X86] Remove alternate CodeGenOnly version of (v)movq that declared the load ↵Craig Topper2016-11-224-37/+11
| | | | | | | | | | size as i128mem. Change all uses to the use the i64mem version. I'm sure this caused the load size to misprint in Intel syntax output. We were also inconsistent about which patterns used which instruction between VEX and EVEX. There are two different reg/reg versions of movq, one from a GPR and one from the lower 64-bits of an XMM register. This changes the loading folding table to use the single i64mem memory form for folding both cases. But we need to use TB_NO_REVERSE to prevent a duplicate entry in the unfolding table. llvm-svn: 287622
* [AVX-512] Add support for commuting VPERMT2(B/W/D/Q/PS/PD) to/from ↵Craig Topper2016-11-222-15/+124
| | | | | | | | | | | | | | | | | VPERMI2(B/W/D/Q/PS/PD). Summary: The index and one of the table operands can be swapped by changing the opcode to the other version. Neither of these operands are the one that can load from memory so this can't be used to increase memory folding opportunities. We need to handle the unmasked forms and the kz forms. Since the load operand isn't being commuted we can commute the load and broadcast instructions too. Reviewers: igorb, delena, Ayal, Farhana, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25652 llvm-svn: 287621
* MC: ensure that we have a section before accessing itSaleem Abdulrasool2016-11-221-15/+17
| | | | | | | | | | | | We would attempt to access the symbol section without ensuring that the symbol was not absolute. When the assembler referenced relocation is not evaluated to the absolute, but when we record the relocation, we would query the section. Because the symbol is absolute, it does not have a section associated with it, triggering an assertion. Just be more careful about the access of the section. Addresses PR31064! llvm-svn: 287619
* [AVX-512] Add support for changing the element size of ↵Craig Topper2016-11-221-0/+63
| | | | | | | | | | | | | | | | | | | PALIGNR/VALIGND/VALIGNQ shuffles if they feed a vselect with a different type Summary: Shuffle lowering widens the element size of a shuffle if elements are contiguous. This is sometimes help because wider element types have more shuffle options. If the shuffle is one of the arguments to a vselect this shuffle widening can introduce a bitcast between the vselect and the shuffle. This will prevent isel from selecting a masked operation. If the shuffle can be written equally efficiently with a different element size to match the vselect type we should change the shuffle type to allow masking. This patch does this conversion for all VALIGND/VALIGNQ sizes. It also supports turning 128-bit PALIGNR into VALIGND/VALIGNQ. This fixes the case shown in PR31018. I plan to add support for more operations in future patches. Reviewers: RKSimon, zvi, delena Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26902 llvm-svn: 287612
* Object: Make SymbolicFile::symbol_{begin,end}() virtual and remove ↵Peter Collingbourne2016-11-223-6/+6
| | | | | | unnecessary wrappers. llvm-svn: 287611
* [AMDGPU] Fix multiple vreg definitions in si-lower-control-flowStanislav Mekhanoshin2016-11-221-7/+15
| | | | | | Differential Revision: https://reviews.llvm.org/D26939 llvm-svn: 287608
* Analysis: gep inbounds (gep inbounds (...)) is inbounds.Peter Collingbourne2016-11-221-2/+4
| | | | | | Differential Revision: https://reviews.llvm.org/D26441 llvm-svn: 287604
* DAG: Ignore call site attributes when emitting target intrinsicMatt Arsenault2016-11-211-2/+6
| | | | | | | | | | A target intrinsic may be defined as possibly reading memory, but the call site may have additional knowledge that it doesn't read memory. The intrinsic lowering will expect the pessimistic assumption of the intrinsic definition, so the chain should still be used. llvm-svn: 287593
* [AArch64LoadStoreOptimizer] Don't treat write to XZR/WZR as a clobber.Geoff Berry2016-11-211-2/+4
| | | | | | | | | | | | | | | Summary: When searching for load/store instructions to pair/merge don't treat writes to WZR/XZR as clobbers since they don't change the value read from WZR/XZR (which is always 0). Reviewers: mcrosier, junbuml, jmolloy, t.p.northover Subscribers: aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D26921 llvm-svn: 287592
* [CodeGenPrepare] Don't sink non-cheap addrspacecasts.Justin Lebar2016-11-211-0/+8
| | | | | | | | | | | | | | | | | | | | Summary: Previously, CGP would unconditionally sink addrspacecast instructions, even going so far as to sink them into a loop. Now we check that the cast is "cheap", as defined by TLI. We introduce a new "is-cheap" function to TLI rather than using isNopAddrSpaceCast because some GPU platforms want the ability to ask for non-nop casts to be sunk. Reviewers: arsenm, tra Subscribers: jholewinski, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D26923 llvm-svn: 287591
* [CodeGenPrepare] Rewrite a loop in terms of llvm::none_of. NFC.Justin Lebar2016-11-211-11/+3
| | | | | | | | | | Reviewers: arsenm Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D26924 llvm-svn: 287590
* [LoopReroll] Make root-finding more aggressive.Eli Friedman2016-11-211-50/+58
| | | | | | | | | | Allow using an instruction other than a mul or phi as the base for root-finding. For example, the included testcase includes a loop which requires using a getelementptr as the base for root-finding. Differential Revision: https://reviews.llvm.org/D26529 llvm-svn: 287588
* [InstCombine] canonicalize min/max constant to select's false valueSanjay Patel2016-11-211-0/+42
| | | | | | | | | | | | | | | | | | | | This is a first step towards canonicalization and improved folding/codegen for integer min/max as discussed here: http://lists.llvm.org/pipermail/llvm-dev/2016-November/106868.html Here, we're just matching the simplest min/max patterns and adjusting the icmp predicate while swapping the select operands. I've included FIXME tests in test/Transforms/InstCombine/select_meta.ll so it's easier to see how this might be extended (corresponds to the TODO comment in the code). That's also why I'm using matchSelectPattern() rather than a simpler check; once the backend is patched, we can just remove some of the restrictions to allow the obfuscated min/max patterns in the FIXME tests to be matched. Differential Revision: https://reviews.llvm.org/D26525 llvm-svn: 287585
* LSR debug fix.Evgeny Stupachenko2016-11-211-1/+1
| | | | | | | | | | | Summary: Dump instruction instead of address. Reviewers: hfinkel Differential Revision: http://reviews.llvm.org/D26877 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 287584
* fix formatting; NFCSanjay Patel2016-11-211-1/+0
| | | | llvm-svn: 287582
* [asan] Make ASan compatible with linker dead stripping on WindowsReid Kleckner2016-11-211-47/+97
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is similar to what was done for Darwin in rL264645 / http://reviews.llvm.org/D16737, but it uses COFF COMDATs to achive the same result instead of relying on new custom linker features. As on MachO, this creates one metadata global per instrumented global. The metadata global is placed in the custom .ASAN$GL section, which the ASan runtime will iterate over during initialization. There are no other references to the metadata, so normal linker dead stripping would discard it. However, the metadata is put in a COMDAT group with the instrumented global, so that it will be discarded if and only if the instrumented global is discarded. I didn't update the ASan ABI version check since this doesn't affect non-Windows platforms, and the WinASan ABI isn't really stable yet. Implementing this for ELF will require extending LLVM IR and MC a bit so that we can use non-COMDAT section groups. Reviewers: pcc, kcc, mehdi_amini, kubabrecka Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26770 llvm-svn: 287576
* [mips] seq macro supportSimon Dardis2016-11-212-0/+110
| | | | | | | | | | | | | | This patch adds the seq macro. This partially resolves PR/30381. Thanks to Sean Bruno for reporting the issue! Reviewers: zoran.jovanovic, vkalintiris, seanbruno Differential Revision: https://reviews.llvm.org/D24607 llvm-svn: 287573
* Check proper live range in extendPHIRangesKrzysztof Parzyszek2016-11-212-7/+17
| | | | | | | | | | | | | The function extendPHIRanges checks the main range of the original live interval, even when dealing with a subrange. This could also lead to an assert when the subrange is not live at the extension point, but the main range is. To avoid this, check the corresponding subrange of the original live range, instead of always checking the main range. Review (as a part of a bigger set of changes): https://reviews.llvm.org/D26359 llvm-svn: 287571
* [TLI] Fix breakage introduced by D21739.Marcin Koscielnicki2016-11-211-19/+19
| | | | | | | | | The initialize function has an early return for AMDGPU targets. If taken, the ShouldExtI32* initialization code will not be executed, resulting in invalid values in the corresponding fields. Fix this by moving the code to the top of the function. llvm-svn: 287570
* [AsmPrinter] Enable codeview for windows-itaniumShoaib Meenai2016-11-211-1/+2
| | | | | | | | | | Enable codeview emission for windows-itanium targets. Co-opt an existing test (which is derived from a C source file and should therefore be identical across the Itanium and MS ABIs). Differential Revision: https://reviews.llvm.org/D26693 llvm-svn: 287567
* [MemorySSA] Fix for non-determinism in codegenMandeep Singh Grang2016-11-211-2/+11
| | | | | | | | | | | | | | | | | | | | This patch fixes the non-determinism caused due to iterating SmallPtrSet's which was uncovered due to the experimental "reverse iteration order " patch: https://reviews.llvm.org/D26718 The following unit tests failed because of the undefined order of iteration. LLVM :: Transforms/Util/MemorySSA/cyclicphi.ll LLVM :: Transforms/Util/MemorySSA/many-dom-backedge.ll LLVM :: Transforms/Util/MemorySSA/many-doms.ll LLVM :: Transforms/Util/MemorySSA/phi-translation.ll Reviewers: dberlin, mgrang Subscribers: dberlin, llvm-commits, david2050 Differential Revision: https://reviews.llvm.org/D26704 llvm-svn: 287563
* [VectorLegalizer] Remove EVT::getSizeInBits code duplications. NFCI.Simon Pilgrim2016-11-211-8/+6
| | | | | | We were calling SVT.getSizeInBits() several times in a row - just call it once and reuse the result. llvm-svn: 287556
* [CodeGenPrep] Skip merging empty case blocksJun Bum Lim2016-11-211-32/+135
| | | | | | | | | | | | | | | | Summary: Merging an empty case block into the header block of switch could cause ISel to add COPY instructions in the header of switch, instead of the case block, if the case block is used as an incoming block of a PHI. This could potentially increase dynamic instructions, especially when the switch is in a loop. I added a test case which was reduced from the benchmark I was targetting. Reviewers: t.p.northover, mcrosier, manmanren, wmi, davidxl Subscribers: qcolombet, danielcdh, hfinkel, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D22696 llvm-svn: 287553
* small fixup which enables the issuing of the aforementioned instruction (w/o ↵Coby Tayree2016-11-211-1/+1
| | | | | | | | operands), on MS/Intel syntax. Differential Revision: https://reviews.llvm.org/D26913 llvm-svn: 287548
* Fix known zero bits for addrspacecast.Yaxun Liu2016-11-211-1/+0
| | | | | | | | | | Currently LLVM assumes that a pointer addrspacecasted to a different addr space is equivalent to trunc or zext bitwise, which is not true. For example, in amdgcn target, when a null pointer is addrspacecasted from addr space 4 to 0, its value is changed from i64 0 to i32 -1. This patch teaches LLVM not to assume known bits of addrspacecast instruction to its operand. Differential Revision: https://reviews.llvm.org/D26803 llvm-svn: 287545
* [SelectionDAG] Add ComputeNumSignBits support for CONCAT_VECTORS opcodeSimon Pilgrim2016-11-211-0/+7
| | | | llvm-svn: 287541
* [X86][SSE] Allow PACKSS to be used to truncate any type of all/none sign ↵Simon Pilgrim2016-11-211-16/+16
| | | | | | | | | | bits input At the moment we only use truncateVectorCompareWithPACKSS with direct vector comparison results (just one example of a known all/none signbits input). This change relaxes the direct matching of a SETCC opcode by moving the logic up into SelectionDAG::ComputeNumSignBits and accepting any input with a known splatted signbit. llvm-svn: 287535
* [InstrProfiling] Mark __llvm_profile_instrument_target last parameter as i32 ↵Marcin Koscielnicki2016-11-211-11/+30
| | | | | | | | | | | | | | | | | zeroext if appropriate. On some architectures (s390x, ppc64, sparc64, mips), C-level int is passed as i32 signext instead of plain i32. Likewise, unsigned int may be passed as i32, i32 signext, or i32 zeroext depending on the platform. Mark __llvm_profile_instrument_target properly (its last parameter is unsigned int). This (together with the clang change) makes compiler-rt profile testsuite pass on s390x. Differential Revision: http://reviews.llvm.org/D21736 llvm-svn: 287534
OpenPOWER on IntegriCloud