path: root/llvm/lib/Target
Commit message  Author  Age  Files  Lines
...
* [X86] In combineMOVMSK, look through int->fp bitcasts before calling ↵  Craig Topper  2018-09-11  1  -1/+7
| | | | | | | | SimplifyDemandedBits. MOVMSKPS and MOVMSKPD both take FP types, but likely the operations before it are on integer types with just an int->fp bitcast between them. If the bitcast isn't used by anything else and doesn't change the element width we can look through it to simplify the integer ops. llvm-svn: 341915
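Note (illustration only, not code from the commit): the reason a same-width int->fp bitcast can be looked through is that MOVMSK only reads each lane's sign bit, and a bitcast leaves every bit in place. The sketch below models this in plain C++; `movmskModel`, `signMaskFromInts` and `bitcastLanes` are hypothetical helper names, and a 32-bit IEEE-754 `float` is assumed.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical model of MOVMSKPS on four float lanes: collect the sign bit
// (bit 31) of each lane into the low four bits of the result.
inline unsigned movmskModel(const std::array<float, 4> &Lanes) {
  unsigned Mask = 0;
  for (unsigned I = 0; I < 4; ++I) {
    std::uint32_t Bits;
    std::memcpy(&Bits, &Lanes[I], sizeof(Bits)); // reinterpret, no conversion
    Mask |= (Bits >> 31) << I;
  }
  return Mask;
}

// The same mask computed directly on the integer lanes, i.e. what remains
// visible once the bitcast has been looked through.
inline unsigned signMaskFromInts(const std::array<std::int32_t, 4> &Lanes) {
  unsigned Mask = 0;
  for (unsigned I = 0; I < 4; ++I)
    Mask |= (static_cast<std::uint32_t>(Lanes[I]) >> 31) << I;
  return Mask;
}

// Model of a v4i32 -> v4f32 bitcast: same element width, bits unchanged.
inline std::array<float, 4> bitcastLanes(const std::array<std::int32_t, 4> &In) {
  std::array<float, 4> Out;
  static_assert(sizeof(Out) == sizeof(In), "lane layouts must match");
  std::memcpy(Out.data(), In.data(), sizeof(Out));
  return Out;
}
```

For any input, `movmskModel(bitcastLanes(V))` and `signMaskFromInts(V)` agree, which is why only the sign bits of the integer operations are demanded.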
* NFC: use bit_cast more in AArch64AddressingModes  JF Bastien  2018-09-11  1  -24/+11
| | | | | | | | This was previously committed as r341749 then reverted as r341750 because bit_cast needed to do its own thing to check is_trivially_copyable on GCC 4.x. This is now done and std::array should now get accepted. llvm-svn: 341897
* AMDGPU: Remove leftovers from configurable address spaces  Matt Arsenault  2018-09-11  2  -34/+12
| | | | llvm-svn: 341895
* Move FeatureAES from SLM, WSM and SNB to GLM and SKL  Erich Keane  2018-09-10  1  -3/+1
| | | | | | | | | | | | | Complements https://reviews.llvm.org/D51510 and matches https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01940.html GoldmontProc already has FeatureAES. Patch By: thiagomacieira Differential Revision: https://reviews.llvm.org/D51565 llvm-svn: 341861
* [X86] Mark the ISD::SETLT/SETLE condition codes as illegal for v32i16/v64i8 ↵  Craig Topper  2018-09-10  1  -0/+5
| | | | | | | | to match the other vector types. I'm having a hard time finding a test case for this, but we should be consistent here. The fact that we canonicalize all zeros and all ones constants to vXi32 and all other constants to loads makes it hard to hit the DAG combine infinite loop that is easy to trigger for some of the other types. llvm-svn: 341859
* [Hexagon] Split large offsets into properly aligned addends  Krzysztof Parzyszek  2018-09-10  1  -0/+9
| | | | llvm-svn: 341851
* [ARC] Fix macro usage (DEBUG -> LLVM_DEBUG)  Tatyana Krasnukha  2018-09-10  1  -1/+1
| | | | llvm-svn: 341844
* [AMDGPU] Preliminary patch for divergence driven instruction selection. ↵  Alexander Timofeev  2018-09-10  1  -5/+33
| | | | | | | | | | Inline immediate move to V_MADAK_F32. Differential revision: https://reviews.llvm.org/D51586 Reviewer: rampitec llvm-svn: 341843
* [MIPS GlobalISel] Select icmp  Petar Jovanovic  2018-09-10  3  -0/+89
| | | | | | | | | | Select 32bit integer compare instructions for MIPS32. Patch by Petar Avramovic. Differential Revision: https://reviews.llvm.org/D51489 llvm-svn: 341840
* [Sparc] Move SparcTargetStreamer.h to the MC Desc, where the implementation ↵  Benjamin Kramer  2018-09-10  2  -4/+3
| | | | | | is already llvm-svn: 341826
* [Target] Untangle disassemblers  Benjamin Kramer  2018-09-10  15  -43/+37
| | | | | | | Disassemblers cannot depend on main target headers. The same is true for MCTargetDesc, but there's a lot more cleanup needed for that. llvm-svn: 341822
* Don't create a temporary vector of loop blocks just to iterate over them.  Benjamin Kramer  2018-09-10  1  -8/+6
| | | | | | Loop's getBlocks returns an ArrayRef. llvm-svn: 341821
* AMDGPU: Remove function pointer type hack  Matt Arsenault  2018-09-10  1  -7/+4
| | | | | | | Now the pointer size should always be correct and we don't need to improperly inspect the pointee type. llvm-svn: 341806
* AMDGPU: Stop reporting is-noop addrspacecast for constant 32-bit  Matt Arsenault  2018-09-10  1  -2/+1
| | | | | | | This will require something to cast. Previously this would eliminate the cast, which would result in copies of $noreg. llvm-svn: 341803
* DAG: Handle odd vector sizes in calling conv splitting  Matt Arsenault  2018-09-10  1  -8/+5
| | | | | | | | | | | | | | This already worked if only one register piece was used, but didn't if a type was split into multiple, unequal sized pieces. Fixes not splitting v3i16/v3f16 into two registers for AMDGPU. This will also allow fixing the ABI for 16-bit vectors in a future commit so that it's the same for all subtargets. llvm-svn: 341801
* [AMDGPU] Prevent sequences of non-instructions disrupting ↵  Carl Ritson  2018-09-10  1  -2/+9
| | | | | | | | | | | | | | | | | | GCNHazardRecognizer wait state counting Summary: This fixes a bug where a large number of implicit def instructions can fill the GCNHazardRecognizer lookahead buffer causing required NOPs to not be inserted. Reviewers: nhaehnle, arsenm Reviewed By: arsenm Subscribers: sheredom, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D51726 Change-Id: Ie75338f94de704ee5816b05afd0c922c6748a95b llvm-svn: 341798
* AMDGPU: Use GOT PSV since it has an address space now  Matt Arsenault  2018-09-10  1  -2/+2
| | | | llvm-svn: 341768
* AMDGPU: Don't abort on unknown addrspace argument  Matt Arsenault  2018-09-10  1  -8/+10
| | | | llvm-svn: 341767
* [X86] Custom type legalize (v2i32 (fp_to_uint v2f64)) without avx512vl by ↵  Craig Topper  2018-09-09  1  -12/+13
| | | | | | | | | | widening to v4i32 and v4f64 instead of v8i32 and v8f64. Make it aware of x86-experimental-vector-widening-legalization. We have isel patterns for v4i32/v4f64 that artificially widen to v8i32/v8f64 so just use that. If x86-experimental-vector-widening-legalization is enabled, we don't need any custom legalization and can just return. I've modified the test RUN lines to cover this case. llvm-svn: 341765
* [X86] Create paddus/psubus from narrower vectors with i8/i16 element types.  Craig Topper  2018-09-08  1  -8/+12
| | | | | | | | | | | | | | | | | Summary: This patch allows vectors with a power of 2 number of elements and i8/i16 element type to select paddus/psubus instructions. ReplaceNodeResults has been updated to custom widen these operations up to 128 bits like we already do for PAVG. Another step towards fixing PR38691 Reviewers: RKSimon, spatel Reviewed By: RKSimon, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D51818 llvm-svn: 341753
* [X86] Mark the ADCX and ADOX instructions as commutable.  Craig Topper  2018-09-08  1  -1/+1
| | | | llvm-svn: 341752
* Revert "NFC: use bit_cast more in AArch64AddressingModes"  JF Bastien  2018-09-08  1  -11/+24
| | | | | | | It seems some bots think std::array is either not trivially-copyable, or isn't the right size. llvm-svn: 341750
* NFC: use bit_cast more in AArch64AddressingModes  JF Bastien  2018-09-08  1  -24/+11
| | | | llvm-svn: 341749
* [X86] Add commuted isel pattern for the load form of ADCX instructions.  Craig Topper  2018-09-08  1  -0/+9
| | | | | | This prevents the legacy ADC instruction from being favored over ADCX when the load is in operand 0. llvm-svn: 341745
* Fix typo in previous commit  JF Bastien  2018-09-08  1  -1/+1
| | | | llvm-svn: 341742
* ADT: add <bit> header, implement C++20 bit_cast, use  JF Bastien  2018-09-08  2  -25/+18
| | | | | | | | | | | | | | Summary: I saw a few places that were punning through a union of FP and integer, and that made me sad. Luckily, C++20 adds bit_cast for exactly that purpose. Implement our own version in ADT (without constexpr, leaving us a bit sad), and use it in the few places my grep-fu found silly union punning. This was originally committed as r341728 and reverted in r341730. Reviewers: javed.absar, steven_wu, srhines Subscribers: dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D51693 llvm-svn: 341741
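Note (illustration only, not the ADT implementation): a memcpy-based cast is the usual way to express what the commit describes without punning through a union, and it is also why the result cannot be constexpr. The sketch below uses a hypothetical name, `sketchBitCast`, and assumes a 32-bit IEEE-754 `float` in the usage shown afterwards.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a C++20-style bit_cast: copy the object
// representation of Src into a value of type To.
template <typename To, typename From>
inline To sketchBitCast(const From &Src) {
  static_assert(sizeof(To) == sizeof(From), "source and destination sizes must match");
  static_assert(std::is_trivially_copyable<From>::value, "From must be trivially copyable");
  static_assert(std::is_trivially_copyable<To>::value, "To must be trivially copyable");
  static_assert(std::is_default_constructible<To>::value, "To must be default constructible");
  To Dst;
  std::memcpy(&Dst, &Src, sizeof(To)); // well-defined, unlike reading an inactive union member
  return Dst;
}
```

Under the IEEE-754 assumption, `sketchBitCast<std::uint32_t>(1.0f)` yields `0x3F800000`. The `is_trivially_copyable` checks are the part the revert messages in this log point at: older standard libraries did not provide that trait.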
* Revert "ADT: add <bit> header, implement C++20 bit_cast, use"  JF Bastien  2018-09-07  2  -17/+25
| | | | | | Bots sad. Looks like missing std::is_trivially_copyable. llvm-svn: 341730
* ADT: add <bit> header, implement C++20 bit_cast, use  JF Bastien  2018-09-07  2  -25/+17
| | | | | | | | | | | | Summary: I saw a few places that were punning through a union of FP and integer, and that made me sad. Luckily, C++20 adds bit_cast for exactly that purpose. Implement our own version in ADT (without constexpr, leaving us a bit sad), and use it in the few places my grep-fu found silly union punning. Reviewers: javed.absar Subscribers: dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D51693 llvm-svn: 341728
* [WebAssembly] v8x16.shuffle  Thomas Lively  2018-09-07  4  -0/+95
| | | | | | | | | | | | | | | | | | Summary: Since the shuffle mask is not exposed as an operand in the native ISel DAG, create a new WebAssembly ISD node exposing the mask. The mask is lowered as sixteen immediate byte indices no matter what type the original vector shuffle was operating on. This CL depends on D51656 Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D51659 llvm-svn: 341718
* [WebAssembly] Change SIMD lane indices to vec_i8imm_op  Thomas Lively  2018-09-07  1  -4/+4
| | | | | | | | | | | | Summary: To explicitly opt out of LEB encoding for these immediates. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D51766 llvm-svn: 341707
* [AArch64] Support reserving x1-7 registers.  Nick Desaulniers  2018-09-07  9  -52/+79
| | | | | | | | | | | | | | | Summary: Reserving registers x1-7 is used to support CONFIG_ARM64_LSE_ATOMICS in Linux kernel. This change adds support for reserving registers x1 through x7. Reviewers: javed.absar, phosek, srhines, nickdesaulniers, efriedma Reviewed By: nickdesaulniers, efriedma Subscribers: niravd, jfb, manojgupta, nickdesaulniers, jyknight, efriedma, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D48580 llvm-svn: 341706
* [X86] Don't create ZERO_EXTEND_INREG/SIGN_EXTEND_INREG for v1iX vectors.  Craig Topper  2018-09-07  1  -1/+1
| | | | | | The generic type legalizer will scalarize v1iX instructions, getting rid of the vector entirely. Creating wider vector instructions is just going to prevent that. llvm-svn: 341705
* [X86] Don't create X86ISD::AVG nodes from v1iX vectors.  Craig Topper  2018-09-07  1  -1/+1
| | | | | | | | The type legalizer will try to scalarize this and fail. It looks like there are some other v1iX oddities out there too since we still generated some vector instructions. llvm-svn: 341704
* [X86] Modify the rdtscp intrinsic to return values instead of taking a ↵  Craig Topper  2018-09-07  1  -18/+18
| | | | | | | | | | pointer argument Similar to what was recently done for addcarry/subborrow and has been done for rdrand/rdseed for a while. It's better to use two results and an explicit store in IR when the store isn't part of the semantics of the instruction. This allows store->load forwarding to happen in the middle end. Or the store to be removed if it's never loaded. Differential Revision: https://reviews.llvm.org/D51803 llvm-svn: 341698
* [RISCV] Fix crash in decoding instruction with unknown floating point ↵  Ana Pazos  2018-09-07  3  -1/+28
| | | | | | | | | | | | | | | | | | | | rounding mode Summary: Instead of crashing in printFRMArg, decode and warn about invalid instruction. This bug was uncovered by a LLVM MC Disassembler Protocol Buffer Fuzzer for the RISC-V assembly language. Reviewers: asb Reviewed By: asb Subscribers: rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, asb Differential Revision: https://reviews.llvm.org/D51705 llvm-svn: 341691
* [RISCV] Fix AddressSanitizer heap-buffer-overflow in disassembling  Ana Pazos  2018-09-07  1  -0/+8
| | | | | | | | | | | | | | | | | | Summary: RISCVDisassembler should check number of bytes available before reading them. Crash noticed when enabling -DLLVM_USE_SANITIZER=Address. This bug was uncovered by a LLVM MC Disassembler Protocol Buffer Fuzzer for the RISC-V assembly language. Reviewers: asb Reviewed By: asb Subscribers: rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, asb Differential Revision: https://reviews.llvm.org/D51708 llvm-svn: 341686
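Note (illustration only, not the RISCVDisassembler code): the fix described above amounts to checking how many bytes remain before reading an instruction word. A minimal sketch of such a check follows; `readLE32` is a hypothetical helper, and a `std::vector` stands in for the byte buffer handed to a disassembler.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: read a 32-bit little-endian word at Offset, but only
// if four bytes are actually available. Returns false instead of reading
// past the end of the buffer.
inline bool readLE32(const std::vector<std::uint8_t> &Bytes, std::size_t Offset,
                     std::uint32_t &Out) {
  // Written so the size arithmetic cannot wrap around.
  if (Offset > Bytes.size() || Bytes.size() - Offset < 4)
    return false;
  Out = static_cast<std::uint32_t>(Bytes[Offset]) |
        (static_cast<std::uint32_t>(Bytes[Offset + 1]) << 8) |
        (static_cast<std::uint32_t>(Bytes[Offset + 2]) << 16) |
        (static_cast<std::uint32_t>(Bytes[Offset + 3]) << 24);
  return true;
}
```

A caller would treat a `false` result as a decode failure rather than proceeding, which is the situation a fuzzer feeding truncated input tends to exercise.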
* [X86] Change the addcarry and subborrow intrinsics to return 2 results and ↵  Craig Topper  2018-09-07  2  -22/+18
| | | | | | | | | | remove the pointer argument. We should represent the store directly in IR instead. This gives the middle end a chance to remove it if it can see a load from the same address. Differential Revision: https://reviews.llvm.org/D51769 llvm-svn: 341677
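Note (illustration only, not the intrinsic definitions): the design choice above, returning both results by value instead of writing the sum through a pointer, can be sketched in portable C++. `AddCarryResult` and `addCarry64` are hypothetical names; the real intrinsics operate at the IR level.

```cpp
#include <cstdint>

// Hypothetical two-result form: carry-out and sum are both returned, so no
// store is implied by the operation itself.
struct AddCarryResult {
  std::uint8_t CarryOut;
  std::uint64_t Sum;
};

inline AddCarryResult addCarry64(std::uint8_t CarryIn, std::uint64_t A,
                                 std::uint64_t B) {
  std::uint64_t Partial = A + B;                 // may wrap
  std::uint64_t Carry1 = Partial < A ? 1 : 0;    // wrapped while adding B
  std::uint64_t Sum = Partial + (CarryIn != 0 ? 1 : 0);
  std::uint64_t Carry2 = Sum < Partial ? 1 : 0;  // wrapped while adding the carry-in
  return {static_cast<std::uint8_t>(Carry1 | Carry2), Sum};
}
```

A caller that wants the old behaviour stores `Sum` explicitly, and an optimizer that can see that store is free to forward or delete it.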
* [X86] Teach X86DAGToDAGISel::foldLoadStoreIntoMemOperand to handle loads in ↵  Craig Topper  2018-09-07  1  -7/+21
| | | | | | | | | | operand 1 of commutable operations. Previously we only handled loads in operand 0, but nothing guarantees the load will be operand 0 for commutable operations. Differential Revision: https://reviews.llvm.org/D51768 llvm-svn: 341675
* Add support for getRegisterByName.  Sid Manning  2018-09-07  2  -0/+16
| | | | | | | | Support required to build the Hexagon Linux kernel. Differential Revision: https://reviews.llvm.org/D51363 llvm-svn: 341658
* ARM: fix Thumb2 CodeGen for ldrex with folded frame-index.  Tim Northover  2018-09-07  5  -3/+12
| | | | | | | | | | | Because t2LDREX (& t2STREX) were marked as AddrModeNone, but did allow a FrameIndex operand, rewriteT2FrameIndex asserted. This gives them a proper addressing-mode and tells the rewriter about it so that encodable offsets are exploited and others are rejected. Should fix PR38828. llvm-svn: 341642
* [AMDGPU] Preliminary patch for divergence driven instruction selection. Fold ↵  Alexander Timofeev  2018-09-07  1  -3/+11
| | | | | | | | | immediate SMRD offset. Differential revision: https://reviews.llvm.org/D51610 Reviewer: rampitec llvm-svn: 341636
* [PowerPC] Combine ADD to ADDZE  QingShan Zhang  2018-09-07  2  -0/+98
| On the ppc64le platform, if the IR has the following form,
|
|   define i64 @addze1(i64 %x, i64 %z) local_unnamed_addr #0 {
|   entry:
|     %cmp = icmp ne i64 %z, CONSTANT      (-32767 <= CONSTANT <= 32768)
|     %conv1 = zext i1 %cmp to i64
|     %add = add nsw i64 %conv1, %x
|     ret i64 %add
|   }
|
| we can optimize it to the form below.
|
|                                 when C == 0
|                             --> addze X, (addic Z, -1)
|                            /
| add X, (zext(setne Z, C))--
|                            \    when -32768 <= -C <= 32767 && C != 0
|                             --> addze X, (addic (addi Z, -C), -1)
|
| Patch By: HLJ2009 (Li Jia He)
| Differential Revision: https://reviews.llvm.org/D51403
| Reviewed By: Nemanjai
| llvm-svn: 341634
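Note (illustration only, not code from the patch): the rewrite works because `addic T, -1` produces a carry-out exactly when T is non-zero, and `addze` then adds that carry to X. The sketch below models the two forms on 64-bit unsigned integers so the identity can be checked; `addNeReference` and `addNeViaCarry` are hypothetical names, and the carry bit is modelled by hand rather than taken from hardware.

```cpp
#include <cstdint>

// Reference semantics of the original IR: X + zext(Z != C).
inline std::uint64_t addNeReference(std::uint64_t X, std::uint64_t Z, std::int64_t C) {
  return X + (Z != static_cast<std::uint64_t>(C) ? 1 : 0);
}

// Model of: addi T, Z, -C ; addic T, T, -1 (sets carry) ; addze X.
inline std::uint64_t addNeViaCarry(std::uint64_t X, std::uint64_t Z, std::int64_t C) {
  std::uint64_t T = Z - static_cast<std::uint64_t>(C); // addi Z, -C (a no-op when C == 0)
  std::uint64_t MinusOne = ~std::uint64_t(0);
  std::uint64_t Dec = T + MinusOne;                    // addic T, -1
  // Unsigned T + 0xFFFF...FFFF wraps (carry-out = 1) exactly when T != 0.
  std::uint64_t Carry = Dec < T ? 1 : 0;
  return X + Carry;                                    // addze X
}
```

The model ignores the 16-bit immediate range restriction on C; that restriction only decides whether `addi` can encode `-C`, not whether the identity holds.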
* [X86] Fix some incorrect comments. NFC  Craig Topper  2018-09-07  1  -3/+3
| | | | llvm-svn: 341624
* [X86] Add RMW ADC patterns with load in operand 1.  Craig Topper  2018-09-06  1  -8/+22
| | | | | | | | ADC is commutable and the load could be in either operand, but we were only checking operand 0. Ideally we'd mark X86adc_flag as commutable and tablegen would automatically do this, but the EFLAGS register mention is preventing it. llvm-svn: 341606
* [X86] Add isel patterns for commuting X86adc_flag with a load in the LHS.  Craig Topper  2018-09-06  2  -0/+12
| | | | | | | | The peephole pass likely gets this normally, but we should be doing it during isel. Ideally we'd just make the X86adc_flag pattern SDNPCommutable, but the tablegen doesn't handle that when one of the operands is a register reference. llvm-svn: 341596
* The initial .text section generated in object files was missing the  Eric Christopher  2018-09-06  2  -1/+31
| | | | | | | | | | | | | | | | | | | | SHF_ARM_PURECODE flag when being built with the -mexecute-only flag. All code sections of an ELF must have the flag set for the final .text section to be execute-only, otherwise the flag gets removed. A HasData flag is added to MCSection to aid in the determination that the section is empty. A virtual setTargetSectionFlags is added to MCELFObjectTargetWriter to allow subclasses to set target specific section flags to be added to sections which we then use in the ARM backend to set SHF_ARM_PURECODE. Patch by Ivan Lozano! Reviewed By: echristo Differential Revision: https://reviews.llvm.org/D48792 llvm-svn: 341593
* Revert r341413  Scott Linder  2018-09-06  3  -232/+67
| | | | | | Causes a regression in expensive checks. llvm-svn: 341589
* [ARC] Prevent InstPrinter from crashing on unknown condition codes.  Tatyana Krasnukha  2018-09-06  1  -3/+8
| | | | | | | | | | | | | Summary: Instruction printer shouldn't crash with assertions due to incorrect input data. llvm_unreachable is not intended for runtime error handling. Reviewers: petecoup Reviewed By: petecoup Differential Revision: https://reviews.llvm.org/D51728 llvm-svn: 341581
* AMDGPU: Remove old hack for function addresses  Matt Arsenault  2018-09-06  1  -13/+0
| | | | llvm-svn: 341567
* ARM64: improve non-zero memset isel by ~2x  JF Bastien  2018-09-06  1  -17/+20
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: I added a few ARM64 memset codegen tests in r341406 and r341493, and annotated where the generated code was bad. This patch fixes the majority of the issues by requesting that a 2xi64 vector be used for memset of 32 bytes and above. The patch leaves the former request for f128 unchanged, despite f128 materialization being suboptimal: doing otherwise runs into other asserts in isel and makes this patch too broad. This patch hides the issue that was present in bzero_40_stack and bzero_72_stack because the code now generates in a better order which doesn't have the store offset issue. I'm not aware of that issue appearing elsewhere at the moment. <rdar://problem/44157755> Reviewers: t.p.northover, MatzeB, javed.absar Subscribers: eraman, kristof.beyls, chrib, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D51706 llvm-svn: 341558