path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* [PowerPC] Add missing record form instructions to the P9 Scheduling Model | Stefan Pintilie | 2017-10-10 | 2 | -1/+32
  A number of record form instructions were missing from the P9 scheduling model. Added those instructions and marked the P9 model as complete.

  Differential Revision: https://reviews.llvm.org/D38560
  llvm-svn: 315313
* after fixing the i386 case | Uriel Korach | 2017-10-10 | 1 | -2/+2
  Change-Id: If6fe0b6ec01f111115fb734fe31c0e152dbc165f
  llvm-svn: 315311
* [mips] Partially fix PR34391 | Simon Dardis | 2017-10-10 | 1 | -4/+11
  Previously, the parsing of the 'subu $reg, ($reg,) imm' relied on a parser which also rendered the operand to the instruction. In some cases the general parser could construct an MCExpr which was not the MCConstantExpr that MipsAsmParser was expecting.

  Address this by altering the special handling to cope with unexpected inputs and fine-tune the handling of cases where a register name that is not available in the current ABI is regarded as not a match for the custom parser but also not as an outright error.

  Also enforces the binutils restriction that only constants are accepted.

  This partially resolves PR34391. Thanks to Ed Maste for reporting the issue!

  Reviewers: nitesh.jain, arichardson
  Differential Revision: https://reviews.llvm.org/D37476
  llvm-svn: 315310
* [DAGCombine] Fix for shuffle to vector extend for non power 2 vectors | David Stuttard | 2017-10-10 | 1 | -0/+3
  Summary: See https://llvm.org/PR33743 for more details.

  It seems that for non-power of 2 vector sizes, the algorithm can produce non-matching sizes for input and result, causing an assert. This usually isn't a problem as the isAnyExtend check will weed these out, but in some cases (most often with lots of undefined values for the mask indices) it can pass this check for non power of 2 vectors.

  Adding in an extra check that ensures that bit size will match for the result and input (as required).

  Subscribers: nhaehnle
  Differential Revision: https://reviews.llvm.org/D35241
  llvm-svn: 315307
* [ARM, Asm] Harden GNU LDRD/STRD aliases against invalid inputs | Oliver Stannard | 2017-10-10 | 1 | -19/+49
  Previously, the code that implemented the GNU assembler aliases for the LDRD and STRD instructions (where the second register is omitted) assumed that the input was a valid instruction. This caused assertion failures for every example in ldrd-strd-gnu-bad-inst.s.

  This improves the code so that, if the instruction is not in the expected format, the check bails out and the asm parser is run on the unmodified instruction.

  It also relaxes the alias on thumb targets, so that unaligned pairs of registers can be used. The restriction that Rt must be even-numbered only applies to the ARM versions of these instructions.

  Differential revision: https://reviews.llvm.org/D36732
  llvm-svn: 315305
* [ARM, Asm] Add diagnostics for floating-point register operands | Oliver Stannard | 2017-10-10 | 2 | -5/+22
  This adds diagnostic strings for the ARM floating-point register classes, which will be used when these classes are expected by the assembler, but the provided operand is not valid.

  One of these, DPR, requires C++ code to select the correct error message, as that class contains different registers depending on the FPU. The rest can all have their diagnostic strings stored in the tablegen description of them.

  Differential revision: https://reviews.llvm.org/D36693
  llvm-svn: 315304
* [ARM, Asm] Add diagnostics for general-purpose register operands | Oliver Stannard | 2017-10-10 | 2 | -5/+34
  This adds diagnostic strings for the ARM general-purpose register classes, which will be used when these classes are expected by the assembler, but the provided operand is not valid.

  One of these, rGPR, requires C++ code to select the correct error message, as that class contains different registers in pre-v8 and v8 targets. The rest can all have their diagnostic strings stored in the tablegen description of them.

  Differential revision: https://reviews.llvm.org/D36692
  llvm-svn: 315303
* AMDGPU: Split MUBUF offset into aligned components | Nicolai Haehnle | 2017-10-10 | 1 | -10/+16
  Summary: Atomic buffer operations do not work (and trap on gfx9) when the components are unaligned, even if their sum is aligned.

  Previously, we generated an offset of 4156 without an SGPR by splitting it as 4095 + 61 (immediate + inline constant). The highest offset for which we can do this correctly is 4156 = 4092 + 64.

  Fixes dEQP-GLES31.functional.ssbo.atomic.*

  Reviewers: arsenm
  Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye
  Differential Revision: https://reviews.llvm.org/D37850
  llvm-svn: 315302
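The split described in this commit can be sketched as follows. This is a minimal illustration, not the code from the patch: `SplitOffset` and `splitAligned` are hypothetical names, and the sketch assumes a 12-bit immediate field, an inline constant of at most 64, and a total offset that is itself 4-byte aligned.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical result type: the two components of a buffer offset.
struct SplitOffset {
  uint32_t Imm;      // part placed in the 12-bit immediate field
  uint32_t Overflow; // part placed in an inline constant (no SGPR needed)
};

// Split Offset so that both parts are multiples of 4.
// Returns std::nullopt when the offset is unaligned or too large to be
// covered by an aligned immediate plus an inline constant of at most 64.
std::optional<SplitOffset> splitAligned(uint32_t Offset) {
  constexpr uint32_t Align = 4;
  constexpr uint32_t MaxImm = 4095 / Align * Align; // 4092, not 4095
  constexpr uint32_t MaxInline = 64;

  if (Offset % Align != 0)
    return std::nullopt; // simplification: only aligned totals are handled
  if (Offset <= MaxImm)
    return SplitOffset{Offset, 0};
  if (Offset <= MaxImm + MaxInline)
    return SplitOffset{MaxImm, Offset - MaxImm};
  return std::nullopt; // would need a register for the remainder
}
```

With these assumptions, 4156 splits as 4092 + 64, and anything above that is rejected by the sketch rather than split into unaligned parts.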
* Revert "[llvm-dwarfdump] Print type names in DW_AT_type DIEs" | Jonas Devlieghere | 2017-10-10 | 1 | -62/+0
  This reverts commit r315297.

  llvm-svn: 315299
* [llvm-dwarfdump] Print type names in DW_AT_type DIEs | Jonas Devlieghere | 2017-10-10 | 1 | -0/+62
  This patch adds printing for DW_AT_type DIEs like it is already the case for DW_AT_specification DIEs. This is a rather naive approach and only a start. We should have pretty printers for different languages.

  Differential revision: https://reviews.llvm.org/D36993
  llvm-svn: 315297
* [SCCP] Fix mem-sanitizer failure introduced by r315288. | Florian Hahn | 2017-10-10 | 1 | -2/+4
  llvm-svn: 315294
* [SCCP] Propagate integer range info for parameters in IPSCCP. | Florian Hahn | 2017-10-10 | 1 | -8/+90
  Summary: This updates the SCCP solver to use the ValueElement lattice for parameters, which provides integer range information. The range information is used to remove unneeded icmp instructions.

  For the following function, f() can be optimized to `ret i32 2` with this change:

      source_filename = "sccp.c"
      target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
      target triple = "x86_64-unknown-linux-gnu"

      ; Function Attrs: norecurse nounwind readnone uwtable
      define i32 @main() local_unnamed_addr #0 {
      entry:
        %call = tail call fastcc i32 @f(i32 1)
        %call1 = tail call fastcc i32 @f(i32 47)
        %add3 = add nsw i32 %call, %call1
        ret i32 %add3
      }

      ; Function Attrs: noinline norecurse nounwind readnone uwtable
      define internal fastcc i32 @f(i32 %x) unnamed_addr #1 {
      entry:
        %c1 = icmp sle i32 %x, 100
        %cmp = icmp sgt i32 %x, 300
        %. = select i1 %cmp, i32 1, i32 2
        ret i32 %.
      }

      attributes #1 = { noinline }

  Reviewers: davide, sanjoy, efriedma, dberlin
  Reviewed By: davide, dberlin
  Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits
  Differential Revision: https://reviews.llvm.org/D36656
  llvm-svn: 315288
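The reasoning behind the `ret i32 2` result can be sketched with a toy interval type. This is only an illustration under stated assumptions: `Range`, `join`, and `foldSgt` are hypothetical names, and a closed interval is a much simpler model than the lattice the solver actually uses.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>

// Hypothetical closed interval [Lo, Hi] of possible parameter values.
struct Range {
  int32_t Lo;
  int32_t Hi;
};

// Merge the range seen at one more call site into the parameter's range.
Range join(Range A, Range B) {
  return {std::min(A.Lo, B.Lo), std::max(A.Hi, B.Hi)};
}

// Try to decide the signed comparison "X > C" for every X in R.
// Returns std::nullopt when the range does not settle the answer.
std::optional<bool> foldSgt(Range R, int32_t C) {
  if (R.Lo > C)
    return true;  // every value is greater than C
  if (R.Hi <= C)
    return false; // no value is greater than C
  return std::nullopt;
}
```

In the IR above, `@f` is only called with 1 and 47, so the joined range is [1, 47]. `foldSgt` then reports that `%x > 300` is always false, which is why the select can be replaced by its second operand.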
* Fix for PR34888. | Nemanja Ivanovic | 2017-10-10 | 1 | -3/+4
  The issue is that we assume operand zero of the input to the add instruction is a register. In this case, the input comes from inline assembly and operand zero is not a register, thereby causing a crash. The code will bail anyway if the input instruction doesn't have the right opcode. So do that check first and let short-circuiting prevent the crash.

  llvm-svn: 315285
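The ordering fix described here can be sketched as below. The types and opcodes (`Operand`, `Instr`, `OpAddLike`, `OpInlineAsm`, `feedsFromReg`) are hypothetical stand-ins, not the PowerPC backend's classes; the point is only that the opcode test precedes the operand access.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a machine operand.
struct Operand {
  bool IsReg;
  unsigned Reg; // only meaningful when IsReg is true
  unsigned getReg() const {
    assert(IsReg && "operand is not a register");
    return Reg;
  }
};

// Hypothetical stand-in for a machine instruction.
struct Instr {
  unsigned Opcode;
  std::vector<Operand> Ops;
};

constexpr unsigned OpAddLike = 1;   // hypothetical opcode with a register in operand 0
constexpr unsigned OpInlineAsm = 2; // hypothetical opcode whose operand 0 is not a register

// The opcode test comes first; && short-circuits, so getReg() is only
// reached for instructions known to carry a register in operand 0.
bool feedsFromReg(const Instr &Def, unsigned WantedReg) {
  return Def.Opcode == OpAddLike && Def.Ops[0].getReg() == WantedReg;
}
```

Swapping the two conditions would evaluate `getReg()` on the inline-assembly operand and trip the assertion, which mirrors the crash the commit describes.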
* SILoadStoreOptimizer.cpp: Fix build; Clang doesn't like "using anonymous struct" since rL315256. | NAKAMURA Takumi | 2017-10-10 | 1 | -1/+1
  llvm-svn: 315283
* Re-land "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion." (fixed stability issues) | Clement Courbet | 2017-10-10 | 1 | -12/+36
  This reverts commit d6492333d3b478a1d88163315002022f8d5e58dc.

  llvm-svn: 315281
* Ignore all duplicate frame index expression | Bjorn Steinbrink | 2017-10-10 | 2 | -24/+26
  Some passes might duplicate calls to llvm.dbg.declare, creating duplicate frame index expressions which currently trigger an assertion that is meant to catch erroneous, overlapping fragment declarations. But identical frame index expressions are just redundant and don't actually conflict with each other, so we can be more lenient and just ignore the duplicates.

  Reviewers: aprantl, rnk
  Subscribers: llvm-commits, JDevlieghere
  Differential Revision: https://reviews.llvm.org/D38540
  llvm-svn: 315279
* [RISCV] Fix build after r315254 | Alex Bradbury | 2017-10-10 | 1 | -2/+3
  createELFObjectWriter now takes a std::unique_ptr<MCELFObjectTargetWriter> rather than a MCELFObjectTargetWriter*.

  llvm-svn: 315275
* [AVX512] Add patterns to commute integer comparison instructions during isel. | Craig Topper | 2017-10-10 | 1 | -0/+41
  This enables broadcast loads to be commuted and allows normal loads to be folded without the peephole pass.

  llvm-svn: 315274
* Re-enable r314928 | Xinliang David Li | 2017-10-10 | 2 | -0/+239
  Eliminate inttype phi with inttoptr/ptrtoint.

  This version fixed a bug in finding the matching phi: the order of the incoming blocks may be different (triggered in self build on Windows). A new test case is added.

  llvm-svn: 315272
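The bug mentioned here, matching PHIs whose incoming blocks appear in a different order, can be illustrated with a small model. This is a sketch, not the patch's code: `Phi` and `phisMatch` are hypothetical, blocks and values are modelled as strings, and each block is assumed to appear at most once per PHI.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of a PHI: a list of (incoming block, incoming value).
using Phi = std::vector<std::pair<std::string, std::string>>;

// Two PHIs match if every block contributes the same value in both,
// regardless of where that block sits in each incoming list.
bool phisMatch(const Phi &A, const Phi &B) {
  if (A.size() != B.size())
    return false;
  // Index B by block name so the lookup does not depend on position.
  std::map<std::string, std::string> ByBlock(B.begin(), B.end());
  for (const auto &[Block, Value] : A) {
    auto It = ByBlock.find(Block);
    if (It == ByBlock.end() || It->second != Value)
      return false;
  }
  return true;
}
```

Comparing the two lists index by index would reject PHIs that differ only in operand order, which is the kind of mismatch the commit message reports.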
* [MC] Properly diagnose badly scoped .cfi_ directives | Reid Kleckner | 2017-10-10 | 1 | -38/+66
  Removes two report_fatal_errors.

  Implement this by removing EmitCFICommon, and do the checking in getCurrentDwarfFrameInfo. Have the callers check for null before dereferencing it.

  llvm-svn: 315264
* [SEH] Use reportError instead of report_fatal_error for bad directives | Reid Kleckner | 2017-10-10 | 5 | -155/+190
  This makes the .seh_ directives slightly more usable from standalone assembly files. This removes a large number of report_fatal_errors and recovers from the error by ignoring the directive.

  llvm-svn: 315262
* [MC] Plumb unique_ptr<MCWasmObjectTargetWriter> through createWasmObjectWriter | Lang Hames | 2017-10-10 | 2 | -7/+10
  to WasmObjectWriter's constructor.

  Fixes the same ownership issue for Wasm that r315245 did for MachO: WasmObjectWriter takes ownership of its MCWasmObjectTargetWriter, so we want to pass this through to the constructor via a unique_ptr, rather than a raw ptr.

  llvm-svn: 315260
* [MC] Suppress .Lcfi labels when emitting textual assembly | Reid Kleckner | 2017-10-10 | 2 | -4/+12
  Summary: This suppresses the generation of .Lcfi labels in our textual assembler. It was annoying that this generated cascading .Lcfi labels:

      llc foo.ll -o - | llvm-mc | llvm-mc

  After three trips through MCAsmStreamer, we'd have three labels in the output when none are necessary. We should only bother creating the labels and frame data when making a real object file.

  This supersedes D38605, which moved the entire .seh_ implementation into MCObjectStreamer.

  This has the advantage that we do more checking when emitting textual assembly, at a minor efficiency cost. Outputting textual assembly is not performance critical, so this shouldn't matter.

  Reviewers: majnemer, MatzeB
  Subscribers: qcolombet, nemanjai, javed.absar, eraman, hiraditya, JDevlieghere, llvm-commits
  Differential Revision: https://reviews.llvm.org/D38638
  llvm-svn: 315259
* Fix Wasm build after r315254 | Reid Kleckner | 2017-10-10 | 1 | -3/+2
  llvm-svn: 315258
* [MC] Plumb unique_ptr<MCWinCOFFObjectTargetWriter> through | Lang Hames | 2017-10-10 | 4 | -14/+14
  createWinCOFFObjectWriter to WinCOFFObjectWriter's constructor.

  Fixes the same ownership issue for COFF that r315245 did for MachO: WinCOFFObjectWriter takes ownership of its MCWinCOFFObjectTargetWriter, so we want to pass this through to the constructor via a unique_ptr, rather than a raw ptr.

  llvm-svn: 315257
* [MC] Plumb unique_ptr<MCELFObjectTargetWriter> through createELFObjectWriter to | Lang Hames | 2017-10-09 | 12 | -33/+33
  ELFObjectWriter's constructor.

  Fixes the same ownership issue for ELF that r315245 did for MachO: ELFObjectWriter takes ownership of its MCELFObjectTargetWriter, so we want to pass this through to the constructor via a unique_ptr, rather than a raw ptr.

  llvm-svn: 315254
* Rename OptimizationDiagnosticInfo.* to OptimizationRemarkEmitter.* | Adam Nemet | 2017-10-09 | 29 | -30/+30
  Sync it up with the name of the class actually defined here. This has been bothering me for a while...

  llvm-svn: 315249
* [MC] Plumb unique_ptr<MCMachObjectTargetWriter> through createMachObjectWriter | Lang Hames | 2017-10-09 | 5 | -14/+12
  to MCObjectWriter's constructor.

  MCObjectWriter takes ownership of its MCMachObjectTargetWriter argument -- this patch plumbs that ownership relationship through the constructor (which previously took raw MCMachObjectTargetWriter*) and the createMachObjectWriter function.

  llvm-svn: 315245
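The ownership pattern that this commit and its ELF, COFF, and Wasm follow-ups apply can be sketched generically. `TargetWriter`, `ObjectWriter`, and `createObjectWriter` below are hypothetical stand-ins, not the MC classes; the sketch only shows a `std::unique_ptr` being passed by value through a factory into a constructor.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-in for a target-specific writer.
struct TargetWriter {
  virtual ~TargetWriter() = default;
  virtual int flavor() const { return 0; }
};

// Hypothetical stand-in for the object writer that owns the target writer.
class ObjectWriter {
  std::unique_ptr<TargetWriter> TW; // owned; released automatically

public:
  // Taking the unique_ptr by value makes the ownership transfer explicit
  // in the signature, unlike a raw TargetWriter* parameter.
  explicit ObjectWriter(std::unique_ptr<TargetWriter> TW) : TW(std::move(TW)) {}
  int flavor() const { return TW->flavor(); }
};

// The factory forwards ownership instead of handing over a raw pointer.
std::unique_ptr<ObjectWriter> createObjectWriter(std::unique_ptr<TargetWriter> TW) {
  return std::make_unique<ObjectWriter>(std::move(TW));
}
```

A caller writes `createObjectWriter(std::make_unique<TargetWriter>())`, or moves an existing pointer in, after which the caller's pointer is null and cannot be freed twice.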
* [GISel]: Fix generation of illegal COPYs during CallLowering | Aditya Nandakumar | 2017-10-09 | 4 | -16/+58
  We end up creating COPYs that are either truncating or extending, and this should be illegal.

  https://reviews.llvm.org/D37640

  Patch for X86 and ARM by igorb, rovka

  llvm-svn: 315240
* [X86] Unsigned saturation subtraction canonicalization [the backend part] | Zvi Rackover | 2017-10-09 | 1 | -0/+87
  Summary: On behalf of julia.koval@intel.com

  The patch transforms the canonical version of unsigned saturation, which is sub(max(a,b),a) or sub(a,min(a,b)), to the special psubus instruction on targets that support it (8-bit and 16-bit uints).

      umax(a,b) - b -> subus(a,b)
      a - umin(a,b) -> subus(a,b)

  There is also an extra case handled, when the right part of the sub is 32-bit and can be truncated using UMIN (this transformation was discussed in https://reviews.llvm.org/D25987).

  The example of special case code:

  ```
  void foo(unsigned short *p, int max, int n) {
    int i;
    unsigned m;
    for (i = 0; i < n; i++) {
      m = *--p;
      *p = (unsigned short)(m >= max ? m-max : 0);
    }
  }
  ```

  Max in this example is truncated to the max_short value if it is greater than m, or just truncated to 16 bits if it is not. It is a valid transformation, because if max > max_short, the result of the expression will be zero.

  Here is the table of types I try to support; special case items are bold:

  | Size | 128 | 256 | 512 |
  | ----- | ----- | ----- | ----- |
  | i8 | v16i8 | v32i8 | v64i8 |
  | i16 | v8i16 | v16i16 | v32i16 |
  | i32 | | **v8i32** | **v16i32** |
  | i64 | | | **v8i64** |

  Reviewers: zvi, spatel, DavidKreitzer, RKSimon
  Reviewed By: zvi
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D37534
  llvm-svn: 315237
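The two identities quoted in this commit message can be checked exhaustively for 8-bit values. The sketch below is scalar C++ rather than vector code, and `subus`, `viaMax`, `viaMin`, and `formsAgree` are hypothetical helper names introduced only for this check.

```cpp
#include <algorithm>
#include <cstdint>

// Reference unsigned saturating subtract: clamps at zero instead of wrapping.
uint8_t subus(uint8_t A, uint8_t B) {
  return A > B ? static_cast<uint8_t>(A - B) : static_cast<uint8_t>(0);
}

// umax(a,b) - b
uint8_t viaMax(uint8_t A, uint8_t B) {
  return static_cast<uint8_t>(std::max(A, B) - B);
}

// a - umin(a,b)
uint8_t viaMin(uint8_t A, uint8_t B) {
  return static_cast<uint8_t>(A - std::min(A, B));
}

// Compare both canonical forms against the reference for all 65536 input pairs.
bool formsAgree() {
  for (int A = 0; A < 256; ++A) {
    for (int B = 0; B < 256; ++B) {
      const uint8_t X = static_cast<uint8_t>(A);
      const uint8_t Y = static_cast<uint8_t>(B);
      if (viaMax(X, Y) != subus(X, Y) || viaMin(X, Y) != subus(X, Y))
        return false;
    }
  }
  return true;
}
```

When a > b both forms reduce to a - b, and otherwise both reduce to 0, so neither form can wrap around.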
* [MC] Use a unique_ptr<MCAssembler> for MCObjectStreamer's Assembler member. | Lang Hames | 2017-10-09 | 1 | -3/+2
  Removes manual new/delete.

  llvm-svn: 315225
* [InstCombine] fix formatting; NFC | Sanjay Patel | 2017-10-09 | 1 | -9/+7
  llvm-svn: 315223
* Fix after r315079 | Adrian McCarthy | 2017-10-09 | 1 | -1/+1
  Microsoft's debug implementation of std::copy checks if the destination is an array and then does some bounds checking. This was causing an assertion failure in fs::rename_internal, which copies to a buffer of the appropriate size but that's type-punned to an array of length 1 for API compatibility reasons.

  Fix is to make the destination a pointer rather than an array.

  llvm-svn: 315222
* [DAG] combine assertsexts around a trunc | Sanjay Patel | 2017-10-09 | 1 | -10/+10
  This was a suggested follow-up to: D37017 / https://reviews.llvm.org/rL313577

  llvm-svn: 315206
* [AArch64] Improve codegen for inverted overflow checking intrinsics | Amara Emerson | 2017-10-09 | 1 | -9/+33
  E.g. if we have a (xor(overflow-bit), 1) where overflow-bit comes from an intrinsic like llvm.sadd.with.overflow then we can kill the xor and use the inverted condition code for the CSEL.

  rdar://28495949

  Reviewed By: kristof.beyls
  Differential Revision: https://reviews.llvm.org/D38160
  llvm-svn: 315205
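Source code of the following shape is one plausible origin of an inverted overflow bit feeding a select. This is an assumption for illustration, not taken from the patch: `addFits` and `addOrFallback` are hypothetical names, and `__builtin_add_overflow` is a GCC/Clang builtin rather than standard C++.

```cpp
#include <cassert>
#include <climits>

// Returns true when A + B fits in an int, i.e. the negation of the
// overflow flag reported by the builtin.
bool addFits(int A, int B, int *Sum) {
  return !__builtin_add_overflow(A, B, Sum);
}

// Select between the sum and a fallback on the inverted overflow condition.
int addOrFallback(int A, int B, int Fallback) {
  int Sum = 0;
  return addFits(A, B, &Sum) ? Sum : Fallback;
}
```

The negation in `addFits` corresponds to the xor with 1 in the commit message; the commit lets the backend fold it into the condition code of the select instead of materializing it.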
* [X86] Remove a setLoadExtAction from the AVX512 section that uses an AVX512BW type and is already present in the AVX512BW section. | Craig Topper | 2017-10-09 | 1 | -1/+0
  llvm-svn: 315202
* [X86] Enable extended comparison predicate support for SETUEQ/SETONE when targeting AVX instructions. | Craig Topper | 2017-10-09 | 2 | -18/+16
  We believe that, despite AMD's documentation, they really do support all 32 comparison predicates under AVX.

  Differential Revision: https://reviews.llvm.org/D38609
  llvm-svn: 315201
* [X86][SSE] Don't call combineTo inside combineX86ShufflesRecursively. NFCI. | Simon Pilgrim | 2017-10-08 | 1 | -51/+60
  Return the combined shuffle from combineX86ShufflesRecursively and perform the combineTo in the caller. Makes it easier for future patches to use this in functions that aren't actually shuffles themselves.

  llvm-svn: 315195
* Tidyup with clang-format. NFCI. | Simon Pilgrim | 2017-10-08 | 1 | -8/+5
  llvm-svn: 315187
* Remove unused variables. No functionality change. | Benjamin Kramer | 2017-10-08 | 10 | -13/+2
  llvm-svn: 315185
* [X86] getTargetConstantBitsFromNode - add support for decoding scalar constants | Simon Pilgrim | 2017-10-08 | 1 | -0/+7
  llvm-svn: 315182
* [X86] Prefer MOVSS/SD over BLENDI during legalization. Remove BLENDI versions of scalar arithmetic patterns | Craig Topper | 2017-10-08 | 3 | -92/+1
  Summary: We currently disable some converting of shuffles to MOVSS/MOVSD during legalization if SSE41 is enabled. But later during shuffle combining we go back to preferring MOVSS/MOVSD.

  Additionally we have patterns that look for BLENDIs to detect scalar arithmetic operations. I believe due to the combining using MOVSS/MOVSD these are unnecessary.

  Interestingly, we still codegen blend instructions even though lowering/isel emit movss/movsd instructions. Turns out machine CSE commutes them to blend, and then commutes those blends back into blends that are equivalent to the original movss/movsd.

  This patch fixes the inconsistency in legalization to prefer MOVSS/MOVSD. The one test change was caused by this change. The problem is that we have integer types and are mostly selecting integer instructions except for the shufps. This shufps forced the execution domain, but the vpblendw couldn't have its domain changed with a naive instruction swap. We could fix this by special casing VPBLENDW based on the immediate to widen the element type.

  The rest of the patch is removing all the excess scalar patterns.

  Long term we should probably add isel patterns to make MOVSS/MOVSD emit blends directly instead of relying on the double commute. We may also want to consider emitting movss/movsd for optsize. I also wonder if we should still use the VEX encoded blendi instructions even with AVX512. Blends have better throughput, and that may outweigh the register constraint.

  Reviewers: RKSimon, zvi
  Reviewed By: RKSimon
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D38023
  llvm-svn: 315181
* [AArch64][GlobalISel] Make G_PHI of p0 types legal. | Amara Emerson | 2017-10-08 | 1 | -1/+1
  Differential Revision: https://reviews.llvm.org/D38621
  llvm-svn: 315177
* [X86][SKX] Adding the scheduling information for the SKX target. | Gadi Haber | 2017-10-08 | 3 | -2/+6952
  Adding the scheduling information for the SkylakeServer (SKX) target.

  This patch adds the instruction scheduling information for the SkylakeServer (SKX) architecture target by adding the file X86SchedSkylakeServer.td located under the X86 Target. We used the scheduling information retrieved from the Skylake architects in order to create the file. The scheduling information includes the latency, the number of micro-ops, and the ports used by each SKL instruction.

  The patch continues the scheduling replacement and insertion effort started with the SNB target in r310792, the HSW target in r311879 and the SkylakeClient (SKL) target in rL313613.

  Please expect some performance fluctuations due to code alignment effects.

  Reviewers: zvi, RKSimon, craig.topper, chandlerc, aymanmu
  Differential Revision: https://reviews.llvm.org/D38443
  Change-Id: I5c228fcc09e9e5a99b6116e62b356c4f9b971185
  llvm-svn: 315175
* [X86] Add missing entries in 'MemoryFoldTable2Addr' to get complete form of the table. | Ayman Musa | 2017-10-08 | 1 | -0/+50
  Get the folding table 'MemoryFoldTable2Addr' to a complete state as part of the process explained in https://reviews.llvm.org/D38028

  Differential Revision: https://reviews.llvm.org/D38500
  llvm-svn: 315174
* [X86][TableGen] Recommitting the X86 memory folding tables TableGen backend while disabling it by default. | Ayman Musa | 2017-10-08 | 1 | -0/+4
  After the original commit (rL304088, https://reviews.llvm.org/rL304088) was reverted, a discussion in llvm-dev was opened on 'how to accomplish this task'. In the discussion we concluded that the best way to achieve our goal (which is to automate the folding tables and remove the manually maintained tables) is:

  1. Commit the tablegen backend disabled by default.
  2. Proceed with an incremental updating of the manual tables, while checking the validity of each added entry.
  3. Repeat the previous step until we reach a state where the generated and the manual tables are identical. Then we can safely remove the manual tables and include the generated tables instead.
  4. Schedule periodical (1 week/2 weeks/1 month) runs of the pass:
     - if changes appear (new entries):
       - make sure the entries are legal
       - if they are not, mark them as illegal to folding
       - commit the changes (if there are any).

  The CMake flag added for this purpose is "X86_GEN_FOLD_TABLES". Building with this flag will run the pass and emit the X86GenFoldTables.inc file under the build/lib/Target/X86/ directory, which is a good reference for any developer who wants to take part in the effort of completing the current folding tables.

  Differential Revision: https://reviews.llvm.org/D38028
  llvm-svn: 315173
* [X86] Stop LowerSIGN_EXTEND_AVX512 from creating v8i16/v16i16/v16i8 vselects with a v8i1/v16i1 condition when BWI is not available. | Craig Topper | 2017-10-08 | 1 | -1/+6
  Some of the tests in vector-shuffle-v1.ll would get into an infinite loop without this.

  llvm-svn: 315172
* [X86] Add new attribute to X86 instructions to enable marking them as "not memory foldable" | Ayman Musa | 2017-10-08 | 5 | -42/+58
  This attribute will be used in a tablegen backend that generates the X86 memory folding tables, which will be added in a future pass. Instructions with this attribute unset will be excluded from the full set of X86 instructions available for the pass.

  Differential Revision: https://reviews.llvm.org/D38027
  llvm-svn: 315171
* [X86] Simplify some code in getInsertVINSERTImmediate and getExtractVEXTRACTImmediate. NFC | Craig Topper | 2017-10-08 | 1 | -4/+2
  Replace one of the divides with a multiply.

  llvm-svn: 315162
* [X86] If we see an insert of a bitcast into zero vector, canonicalize it to move the bitcast to the other side of the insert. | Craig Topper | 2017-10-08 | 2 | -1/+16
  This improves detection of zeroing of upper bits during isel.

  llvm-svn: 315161