summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [Hexagon] Mark RangeTree::dump() with LLVM_DUMP_METHOD.Davide Italiano2017-10-141-1/+1
| | | | | | | GCC otherwise emits a "defined but not used" warning on the member function. llvm-svn: 315838
* AMDGPU: Don't use TargetStreamer if it has not been initializedKonstantin Zhuravlyov2017-10-142-10/+16
| | | | | | | | | | Fixes cfe/trunk/test/Misc/backend-resource-limit-diagnostics.cl test after r315808 We may hit few other similar issues, but I want to discuss good solution offline. llvm-svn: 315830
* [X86][SSE] Don't attempt to reduce the imul vector width of odd sized ↵Simon Pilgrim2017-10-141-1/+4
| | | | | | vectors (PR34947) llvm-svn: 315825
* Revert "[AArch64][RegisterBankInfo] Use the statically computed mappings for ↵Bruno Cardoso Lopes2017-10-141-32/+4
| | | | | | | | | COPY" This reverts commit r315781, breaks: http://green.lab.llvm.org/green/job/Compiler_Verifiers_GlobalISEL/9882 llvm-svn: 315823
* AMDGPU: Bring HSA metadata on par with the specificationKonstantin Zhuravlyov2017-10-145-94/+121
| | | | | | Differential Revision: https://reviews.llvm.org/D38753 llvm-svn: 315821
* Pull out repeated calls to VT.getVectorNumElements(). NFCI.Simon Pilgrim2017-10-141-10/+11
| | | | llvm-svn: 315818
* Use DAG::getBitcast() helper. NFCI.Simon Pilgrim2017-10-141-4/+4
| | | | llvm-svn: 315815
* AMDGPU: Improve note directive verification in assemblerKonstantin Zhuravlyov2017-10-141-1/+19
| | | | | | | | | | - Do not allow amd_amdgpu_isa directives on non-amdgcn architectures - Do not allow amd_amdgpu_hsa_metadata on non-amdhsa OSes - Do not allow amd_amdgpu_pal_metadata on non-amdpal OSes Differential Revision: https://reviews.llvm.org/D38750 llvm-svn: 315812
* AMDGPU: Do not emit deprecated notes for code object v3Konstantin Zhuravlyov2017-10-146-11/+40
| | | | | | Differential Revision: https://reviews.llvm.org/D38749 llvm-svn: 315810
* AMDGPU: Add support for isa version noteKonstantin Zhuravlyov2017-10-146-10/+97
| | | | | | | | | | - Emit NT_AMD_AMDGPU_ISA - Add assembler parsing for isa version directive - If isa version directive does not match command line arguments, then return error Differential Revision: https://reviews.llvm.org/D38748 llvm-svn: 315808
* [X86][SSE] Support combining AND(EXTRACT(SHUF(X)), C) -> EXTRACT(SHUF(X))Simon Pilgrim2017-10-141-0/+39
| | | | | | | | If we are applying a byte mask to a value extracted from a shuffle, see if we can combine the mask into shuffle. Fixes the last issue with PR22415 llvm-svn: 315807
* [X86] Add patterns for vzmovl+cvtpd2dq/cvttpd2dq with a load.Craig Topper2017-10-142-1/+19
| | | | llvm-svn: 315802
* [X86] Add AVX512 versions of VCVTPD2PS to load folding tables.Craig Topper2017-10-141-0/+3
| | | | llvm-svn: 315801
* [X86] Add patterns for vzmovl+cvtpd2ps with a load.Craig Topper2017-10-142-12/+24
| | | | llvm-svn: 315800
* [X86] Remove some patterns for bitcasted alignednonedtemporalloads.Craig Topper2017-10-141-18/+0
| | | | | | These select the same instruction as the non-bitcasted pattern. So this provides no additional value. llvm-svn: 315799
* [X86] Remove unnecessary bitconverts as the root of patterns for zero ↵Craig Topper2017-10-141-4/+4
| | | | | | | | extended VCVTPD2UDQZ128rr and VCVTTPD2UDQZ128rr. We don't need a bitconvert as a root pattern in these cases. The types in the other parts of the pattern are sufficient to express the behavior of these instructions. llvm-svn: 315798
* [X86] Add additional patterns for folding loads with 128-bit VCVTDQ2PD and ↵Craig Topper2017-10-141-0/+10
| | | | | | | | | | VCVTUDQ2PD. This matches the patterns we have for the SSE/AVX version. This is a prerequisite for D38714. llvm-svn: 315797
* [X86] Add AVX512 flavors of VCVTDQ2PD plus VCVTUDQ2PD to the load folding ↵Craig Topper2017-10-141-0/+6
| | | | | | tables. llvm-svn: 315796
* [X86] Remove TB_NO_REVERSE from VCVTDQ2PDYrr and VCVTPS2PDYrr in the load ↵Craig Topper2017-10-141-2/+2
| | | | | | | | folding tables. I believe these were added incorrectly under the belief that the load size was smaller than the input register size, but that's not true. llvm-svn: 315795
* [X86] Add an additional isel pattern to CVTDQ2PDrm/VCVTDQ2PDrm to enable ↵Craig Topper2017-10-141-2/+6
| | | | | | | | load folding without the peephole pass. This pattern is already used in AVX512VL version of these instructions. Though AVX512VL version is missing other patterns. llvm-svn: 315794
* Fix assembler for alloca of multiple elements in non-zero addr spaceYaxun Liu2017-10-141-5/+16
| | | | | | | | | | | | | Currently llvm assembler emits parsing error for valid IR assembly alloca i32, i32 9, addrspace(5) when alloca addr space is 5. This patch fixes that. Differential Revision: https://reviews.llvm.org/D38713 llvm-svn: 315791
* [AArch64][RegisterBankInfo] Use the statically computed mappings for COPYQuentin Colombet2017-10-141-4/+32
| | | | | | | | | | | | | | | | | | | | We use to resort on the generic implementation to get the mappings for COPYs. The generic implementation resorts on table lookup and dynamically allocated objects to get the valid mappings. Given we already know how to map G_BITCAST and have the static mappings for them, use that code path for COPY as well. This is much more efficient. Improve the compile time of RegBankSelect by up to 20%. Note: When we eventually generate all the mappings via TableGen, we wouldn't have to do that dance to shave compile time. The intent of this change was to make sure that moving to static structure really pays off. NFC. llvm-svn: 315781
* Revert r315763: "[Hexagon] Rangify some loops, NFC"Krzysztof Parzyszek2017-10-132-26/+44
| | | | | | Broke some builds (using libstdc++). llvm-svn: 315769
* [X86] Use X86ISD::VBROADCAST in place of v2f64 X86ISD::MOVDDUP when AVX2 is ↵Craig Topper2017-10-133-17/+27
| | | | | | | | | | | | | | | | available This is particularly important for AVX512VL where we are better able to recognize the VBROADCAST loads to fold with other operations. For AVX512VL we now use X86ISD::VBROADCAST for all of the patterns and remove the 128-bit X86ISD::VMOVDDUP. We may be able to use this for AVX1 as well which would allow us to remove more isel patterns. I also had to add X86ISD::VBROADCAST as a node to call combineShuffle for so that we treat it similar to X86ISD::MOVDDUP. Differential Revision: https://reviews.llvm.org/D38836 llvm-svn: 315768
* [Hexagon] Rangify some loops, NFCKrzysztof Parzyszek2017-10-132-44/+26
| | | | llvm-svn: 315763
* [InstCombine] use m_Neg() to reduce code; NFCISanjay Patel2017-10-131-13/+9
| | | | llvm-svn: 315762
* [globalisel][tablegen] Add support for fpimm and import of APInt/APFloat ↵Daniel Sanders2017-10-132-8/+13
| | | | | | | | | | | | | | | | | | | | | | based ImmLeaf. Summary: There's only a tablegen testcase for IntImmLeaf and not a CodeGen one because the relevant rules are rejected for other reasons at the moment. On AArch64, it's because there's an SDNodeXForm attached to the operand. On X86, it's because the rule either emits multiple instructions or has another predicate using PatFrag which cannot easily be supported at the same time. Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Reviewed By: qcolombet Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D36569 llvm-svn: 315761
* [Transforms] Fix some Clang-tidy modernize and Include What You Use ↵Eugene Zelenko2017-10-136-132/+320
| | | | | | warnings; other minor fixes (NFC). llvm-svn: 315760
* [RegisterBankInfo] Cache the getMinimalPhysRegClass informationQuentin Colombet2017-10-132-8/+21
| | | | | | | | | | | | | | | TargetRegisterInfo::getMinimalPhysRegClass is actually pretty expensive because it has to iterate over all the register classes. Cache this information as we need and get it so that we limit its usage. Right now, we heavily rely on it, because this is how we get the mapping for vregs defined by copies from physreg (i.e., the one that are ABI related). Improve compile time by up to 10% for that pass. NFC llvm-svn: 315759
* [Legalizer] Use SmallSetVector instead of SetVector.Quentin Colombet2017-10-131-1/+1
| | | | | | NFC llvm-svn: 315758
* [LegalizerInfo] Don't evaluate end boundary every time through the loopQuentin Colombet2017-10-131-3/+4
| | | | | | | | Match the LLVM coding standard for loop conditions. NFC. llvm-svn: 315757
* [Legalizer] Only allocate the SetVectors once per function.Quentin Colombet2017-10-131-3/+5
| | | | | | | | | | | | Prior to this patch we used to create SetVectors in temporaries that were created and destroyed for each instruction. Now, instead we create and destroyed them only once, but clear the content for each instruction. This speeds up the pass by ~25%. NFC. llvm-svn: 315756
* AMDGPU: Implement hasBitPreservingFPLogicMatt Arsenault2017-10-132-0/+6
| | | | llvm-svn: 315754
* LowerTypeTests: Give imported symbols a type with size 0 so that they are ↵Peter Collingbourne2017-10-131-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | not assumed not to alias. It is possible for both a base and a derived class to be satisfied with a unique vtable. If a program contains casts of the same pointer to both of those types, the CFI checks will be lowered to this (with ThinLTO): if (p != &__typeid_base_global_addr) trap(); if (p != &__typeid_derived_global_addr) trap(); The optimizer may then use the first condition combined with the assumption that __typeid_base_global_addr and __typeid_derived_global_addr may not alias to optimize away the second comparison, resulting in an unconditional trap. This patch fixes the bug by giving imported globals the type [0 x i8]*, which prevents the optimizer from assuming that they do not alias. Differential Revision: https://reviews.llvm.org/D38873 llvm-svn: 315753
* [Hexagon] Avoid unused variable warnings in release builds.Benjamin Kramer2017-10-131-0/+4
| | | | | | No functionality change intended. llvm-svn: 315749
* AMDGPU: Look for src mods before fp_extendMatt Arsenault2017-10-131-1/+17
| | | | | | | When selecting modifiers for mad_mix instructions, look at fneg/fabs that occur before the conversion. llvm-svn: 315748
* [aarch64] Support APInt and APFloat in ImmLeaf subclasses and make AArch64 ↵Daniel Sanders2017-10-131-16/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | use them. Summary: The purpose of this patch is to expose more information about ImmLeaf-like PatLeaf's so that GlobalISel can learn to import them. Previously, ImmLeaf could only be used to test int64_t's produced by sign-extending an APInt. Other tests on immediates had to use the generic PatLeaf and extract the constant using C++. With this patch, tablegen will know how to generate predicates for APInt, and APFloat. This will allow it to 'do the right thing' for both SelectionDAG and GlobalISel which require different methods of extracting the immediate from the IR. This is NFC for SelectionDAG since the new code is equivalent to the previous code. It's also NFC for FastISel because FastIselShouldIgnore is 1 for the ImmLeaf subclasses. Enabling FastIselShouldIgnore == 0 for these new subclasses will require a significant re-factor of FastISel. For GlobalISel, it's currently NFC because the relevant code to import the affected rules is not yet present. This will be added in a later patch. Depends on D36086 Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Reviewed By: qcolombet Subscribers: bjope, aemerson, rengolin, javed.absar, igorb, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D36534 llvm-svn: 315747
* [SmallPtrSet] Add iterator epoch tracking.Benjamin Kramer2017-10-131-0/+1
| | | | | | This will detect invalid iterators when ABI breaking checks are enabled. llvm-svn: 315746
* [InstCombine] move code to remove repeated constant check; NFCISanjay Patel2017-10-131-8/+7
| | | | | | Also, consolidate tests for this fold in one place. llvm-svn: 315745
* AMDGPU: Implement isFPExtFoldableMatt Arsenault2017-10-132-0/+12
| | | | | | This helps match v_mad_mix* in some cases. llvm-svn: 315744
* [InstCombine] recycle adds for better efficiencySanjay Patel2017-10-131-26/+21
| | | | | | Also, clean up unnecessary matcher capture variable initializations. llvm-svn: 315743
* DAG: Add opcode and source type to isFPExtFreeMatt Arsenault2017-10-133-238/+257
| | | | | | | | This is only currently used for mad/fma transforms. This is the only case where it should be used for AMDGPU, so add an opcode to be sure. llvm-svn: 315740
* [Hexagon] Minimize number of repeated constant extendersKrzysztof Parzyszek2017-10-133-0/+1863
| | | | | | | | | | | | | | Each constant extender requires an extra instruction, which adds to the code size and also reduces the number of available slots in an instruction packet. In most cases, the value of a repeated constant extender could be loaded into a register, and the instructions using the extender could be replaced with their counterparts that use that register instead. This patch adds a pass that tries to reduce the number of constant extenders, including extenders which differ only in an immediate offset known at compile time, e.g. @global and @global+12. llvm-svn: 315735
* [InstCombine] use local var to reduce code duplication; NFCISanjay Patel2017-10-131-16/+15
| | | | llvm-svn: 315728
* [X86] Add initial skeleton support for knm cpuCraig Topper2017-10-132-5/+22
| | | | | | | | This adds Intel's Knights Mill CPU to valid CPU names for the backend. For now its an alias of "knl", but ultimately we need to support AVX5124FMAPS and AVX5124VNNIW instruction sets for it. Differential Revision: https://reviews.llvm.org/D38811 llvm-svn: 315722
* [IPSCCP] Move common functions to ValueLatticeUtils (NFC)Matthew Simpson2017-10-133-62/+72
| | | | | | | | | | | This patch moves some common utility functions out of IPSCCP and makes them available globally. The functions determine if interprocedural data-flow analyses can propagate information through function returns, arguments, and global variables. Differential Revision: https://reviews.llvm.org/D37638 llvm-svn: 315719
* [InstCombine] add hasOneUse check to add-zext-add fold to prevent increasing ↵Sanjay Patel2017-10-131-4/+2
| | | | | | instructions llvm-svn: 315718
* [SCEV] Maintain and use a loop->loop invalidation dependencySanjoy Das2017-10-131-31/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change uses the loop use list added in the previous change to remember the loops that appear in the trip count expressions of other loops; and uses it in forgetLoop. This lets us not scan every loop in the function on a forgetLoop call. With this change we no longer invalidate clear out backedge taken counts on forgetValue. I think this is fine -- the contract is that SCEV users must call forgetLoop(L) if their change to the IR could have changed the trip count of L; solely calling forgetValue on a value feeding into the backedge condition of L is not enough. Moreover, I don't think we can strengthen forgetValue to be sufficient for invalidating trip counts without significantly re-architecting SCEV. For instance, if we have the loop: I = *Ptr; E = I + 10; do { // ... } while (++I != E); then the backedge taken count of the loop is 9, and it has no reference to either I or E, i.e. there is no way in SCEV today to re-discover the dependency of the loop's trip count on E or I. So a SCEV client cannot change E to (say) "I + 20", call forgetValue(E) and expect the loop's trip count to be updated. Reviewers: atrick, sunfish, mkazantsev Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D38435 llvm-svn: 315713
* [InstCombine] use AddOne helper to reduce code; NFCSanjay Patel2017-10-131-6/+3
| | | | llvm-svn: 315709
* [InstCombine] rearrange code to remove repeated constant check; NFCISanjay Patel2017-10-132-7/+7
| | | | llvm-svn: 315703
OpenPOWER on IntegriCloud