summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* Revert "AMDGPU: Add 32-bit constant address space"Rafael Espindola2018-02-0712-86/+19
| | | | | | | | This reverts commit r324487. It broke clang tests. llvm-svn: 324494
* [SelectionDAG] More Aggressibly prune nodes in AddChains. NFCI.Nirav Dave2018-02-071-1/+3
| | | | | | | | Travel all chains paths to first non-tokenfactor node can be exponential work. Add simple redundency check to avoid this. Fixes PR36264. llvm-svn: 324491
* [DebugInfo] Improvements to representation of enumeration types (PR36168)Momchil Velikov2018-02-078-26/+88
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is the LLVM part of fixing the issues, described in https://bugs.llvm.org/show_bug.cgi?id=36168 * The representation of enumerator values in the debug info metadata now contains a boolean flag isUnsigned, which determines how the bits of the value are interpreted. * The DW_TAG_enumeration type DIE now always (for DWARF version >= 3) includes a DW_AT_type attribute, which refers to the underlying integer type, as suggested in DWARFv4 (5.7 Enumeration Type Entries). * The debug info metadata for enumeration type contains (in flags) indication whether this is a C++11 "fixed enum". * For C++11 enumeration with a fixed underlying type, the DIE also includes the DW_AT_enum_class attribute (for DWARF version >= 4). * Encoding of enumerator constants uses DW_FORM_sdata for signed values and DW_FORM_udata for unsigned values, as suggested by DWARFv4 (7.5.4 Attribute Encodings). The changes should be backwards compatible: * the isUnsigned attribute is optional and defaults to false. * if the underlying type for the enumeration is not available, the enumerator values are considered signed. * the FixedEnum flag defaults to clear. * the bitcode format for DIEnumerator stores the unsigned flag bit #1 of the first record element, so the format does not change and the zero previously stored there is consistent with the false default for IsUnsigned. Differential Revision: https://reviews.llvm.org/D42734 llvm-svn: 324489
* AMDGPU: Add 32-bit constant address spaceMarek Olsak2018-02-0712-19/+86
| | | | | | | | | | | | | | | | | | | | | | | Note: This is a candidate for LLVM 6.0, because it was planned to be in that release but was delayed due to a long review period. Merge conflict in release_60 - resolution: Add "-p6:32:32" into the second (non-amdgiz) string. Only scalar loads support 32-bit pointers. An address in a VGPR will fail to compile. That's OK because the results of loads will only be used in places where VGPRs are forbidden. Updated AMDGPUAliasAnalysis and used SReg_64_XEXEC. The tests cover all uses cases we need for Mesa. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D41651 llvm-svn: 324487
* AMDGPU: Remove the s_buffer workaround for GFX9 chipsMarek Olsak2018-02-072-11/+2
| | | | | | | | | | | | | | | | | | Summary: I checked the AMD closed source compiler and the workaround is only needed when x3 is emulated as x4, which we don't do in LLVM. SMEM x3 opcodes don't exist, and instead there is a possibility to use x4 with the last component being unused. If the last component is out of buffer bounds and falls on the next 4K page, the hw hangs. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D42756 llvm-svn: 324486
* [X86][AVX] Add PACKSSDW/PACKUSDW support for truncation of clamped valuesSimon Pilgrim2018-02-071-4/+6
| | | | | | SSE and shorter vector sizes will have to wait until we can add support for general SMIN/SMAX matching. llvm-svn: 324485
* [SLPVectorizer][NFC] Make a loop more readable.Clement Courbet2018-02-071-7/+5
| | | | llvm-svn: 324482
* [mips] Support 'y' operand code to print exact log2 of the operandSimon Atanasyan2018-02-071-0/+7
| | | | llvm-svn: 324477
* [mips] Handle 'M' and 'L' operand codes for memory operandsSimon Atanasyan2018-02-071-6/+16
| | | | | | | | Both operand codes now work the same way in case of register or memory operands. It print high-order or low-order word in a double-word register or memory location. llvm-svn: 324476
* Re-enable "[SCEV] Make isLoopEntryGuardedByCond a bit smarter"Max Kazantsev2018-02-072-5/+71
| | | | | | | | | The failures happened because of assert which was overconfident about SCEV's proving capabilities and is generally not valid. Differential Revision: https://reviews.llvm.org/D42835 llvm-svn: 324473
* [MergeICmps] Re-commit rL324317 "Enable the MergeICmps Pass by default."Clement Courbet2018-02-071-5/+4
| | | | | | | | | | | | | | | | | | With fixes from rL324341. Original commit message: [MergeICmps] Enable the MergeICmps Pass by default. Summary: Now that PR33325 is fixed, this should always improve the generated code. Reviewers: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42793 llvm-svn: 324465
* Revert [SCEV] Make isLoopEntryGuardedByCond a bit smarterSerguei Katkov2018-02-072-71/+5
| | | | | | | | Revert rL324453 commit which causes buildbot failures. Differential Revision: https://reviews.llvm.org/D42835 llvm-svn: 324462
* Revert r324455 "[ThinLTO] - Simplify code in ThinLTOBitcodeWriter."George Rimar2018-02-071-11/+37
| | | | | | | It broke BB: http://lab.llvm.org:8011/builders/sanitizer-windows/builds/23721 llvm-svn: 324458
* [ARM] FP16 mov imm patternSjoerd Meijer2018-02-071-3/+4
| | | | | | | | | This is a follow up of r324321, adding a match pattern for mov with a FP16 immediate (also fixing operand vfp_f16imm that wasn't even compiling). Differential Revision: https://reviews.llvm.org/D42973 llvm-svn: 324456
* [ThinLTO] - Simplify code in ThinLTOBitcodeWriter.George Rimar2018-02-071-37/+11
| | | | | | | | | | Recently introduced convertToDeclaration is very similar to code used in filterModule function. Patch reuses it to reduce duplication. Differential revision: https://reviews.llvm.org/D42971 llvm-svn: 324455
* [SCEV] Make isLoopEntryGuardedByCond a bit smarterMax Kazantsev2018-02-072-5/+71
| | | | | | | | | | | Sometimes `isLoopEntryGuardedByCond` cannot prove predicate `a > b` directly. But it is a common situation when `a >= b` is known from ranges and `a != b` is known from a dominating condition. Thia patch teaches SCEV to sum these facts together and prove strict comparison via non-strict one. Differential Revision: https://reviews.llvm.org/D42835 llvm-svn: 324453
* [LoopPrediction] Introduce utility function getLatchPredicateForGuard. NFC.Serguei Katkov2018-02-071-17/+30
| | | | | | | Factor out getting the predicate for latch condition in a guard to utility function getLatchPredicateForGuard. llvm-svn: 324450
* [x86/retpoline] Make the external thunk names exactly match the namesChandler Carruth2018-02-071-15/+44
| | | | | | | | | | | | | | | | | | | that happened to end up in GCC. This is really unfortunate, as the names don't have much rhyme or reason to them. Originally in the discussions it seemed fine to rely on aliases to map different names to whatever external thunk code developers wished to use but there are practical problems with that in the kernel it turns out. And since we're discovering this practical problems late and since GCC has already shipped a release with one set of names, we are forced, yet again, to blindly match what is there. Somewhat rushing this patch out for the Linux kernel folks to test and so we can get it patched into our releases. Differential Revision: https://reviews.llvm.org/D42998 llvm-svn: 324449
* [LegalizeDAG] Truncate condition operand of ISD::SELECTEugene Leviant2018-02-071-1/+6
| | | | | | Differential revision: https://reviews.llvm.org/D42737 llvm-svn: 324447
* AMDGPU/GlobalISel: Mark 32-bit G_FPTOUI as legalTom Stellard2018-02-071-0/+3
| | | | | | | | | | | | Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D42152 llvm-svn: 324446
* Follow-up for r324429: "[LCSSAVerification] Run verification only when ↵Michael Zolotukhin2018-02-071-1/+5
| | | | | | | | | | | | | asserts are enabled." Before r324429 we essentially didn't have a verification of LCSSA, so no wonder that it has been broken: currently loop-sink breaks it (the attached test illustrates the failure). It was detected during a stage2 RA build, so to unbreak it I'm disabling the check for now. llvm-svn: 324445
* [ThinLTO] Serialize WithGlobalValueDeadStripping index flag for distributed ↵Teresa Johnson2018-02-072-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | backends Summary: A recent fix to drop dead symbols (r323633) did not work for ThinLTO distributed backends because we lose the WithGlobalValueDeadStripping set on the index during the thin link. This patch adds a new flags record to the bitcode format for the index, and serializes this flag for the combined index (it would always be 0 for the per-module index generated by the compile step, so no need to serialize the new flags record there until/unless we add another flag that applies to the per-module indexes). Generally this flag should always be set for the distributed backends, which are necessarily performed after the thin link. However, if we were to simply set this flag on the index applied to the distributed backends (invoked via clang), we would lose the ability to disable dead stripping via -compute-dead=false for debugging purposes. Reviewers: grimar, pcc Subscribers: mehdi_amini, inglorion, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D42799 llvm-svn: 324444
* [AMDGPU] Suppress redundant waitcnt instrs.Mark Searles2018-02-072-19/+39
| | | | | | | | | | | | 1. Run the memory legalizer prior to the waitcnt pass; keep the policy that the waitcnt pass does not remove any waitcnts within the incoming IR. 2. The waitcnt pass doesn't (yet) track waitcnts that exist prior to the waitcnt pass (it just skips over them); because the waitcnt pass is ignorant of them, it may insert a redundant waitcnt. To avoid this, check the prev instr. If it and the to-be-inserted waitcnt are the same, suppress the insertion. We keep the existing waitcnt under the assumption that whomever, e.g., the memory legalizer, inserted it knows what they were doing. 3. Follow-on work: teach the waitcnt pass to record the pre-existing waitcnts for better waitcnt production. Differential Revision: https://reviews.llvm.org/D42854 llvm-svn: 324440
* AMDGPU: Select BFI patterns with 64-bit intsMatt Arsenault2018-02-073-6/+46
| | | | llvm-svn: 324431
* [LCSSAVerification] Run verification only when asserts are enabled.Michael Zolotukhin2018-02-071-1/+3
| | | | llvm-svn: 324429
* [DAGCombiner][AMDGPU][X86] Turn cttz/ctlz into ↵Craig Topper2018-02-062-11/+14
| | | | | | | | | | | | cttz_zero_undef/ctlz_zero_undef if we can prove the input is never zero X86 currently has a late DAG combine after cttz/ctlz are turned into BSR+BSF+CMOV to detect this and remove the CMOV. But we should be able to do this much earlier and avoid creating the cmov all together. For the changed AMDGPU test case it appears that previously the i8 cttz was type legalized to i16 which introduced an OR with 256 in order to limit the result to 8 on the widened type. At this point the result is known to never be zero, but nothing checked that. Then operation legalization is told to promote all i16 cttz to i32. This introduces an extend and a truncate and another OR with 65536 to limit the result to 16. With the DAG combiner change we are able to prevent the creation of the second OR since the opcode will have been changed to cttz_zero_undef after the first OR. I the lack of the OR caused the instruction to change to v_ffbl_b32_sdwa Differential Revision: https://reviews.llvm.org/D42985 llvm-svn: 324427
* Add DWARF for discriminated unionsAdrian Prantl2018-02-0611-33/+99
| | | | | | | | | | | | | | | | | | | | | | | | | n Rust, an enum that carries data in the variants is, essentially, a discriminated union. Furthermore, the Rust compiler will perform space optimizations on such enums in some situations. Previously, DWARF for these constructs was emitted using a hack (a magic field name); but this approach stopped working when more space optimizations were added in https://github.com/rust-lang/rust/pull/45225. This patch changes LLVM to allow discriminated unions to be represented in DWARF. It adds createDiscriminatedUnionType and createDiscriminatedMemberType to DIBuilder and then arranges for this to be emitted using DWARF's DW_TAG_variant_part and DW_TAG_variant. Note that DWARF requires that a discriminated union be represented as a structure with a variant part. However, as Rust only needs to emit pure discriminated unions, this is what I chose to expose on DIBuilder. Patch by Tom Tromey! Differential Revision: https://reviews.llvm.org/D42082 llvm-svn: 324426
* Place undefined globals in .bss instead of .dataEli Friedman2018-02-061-1/+14
| | | | | | | | | | | | | | | | | | | | | | Following up on the discussion from http://lists.llvm.org/pipermail/llvm-dev/2017-April/112305.html, undef values are now placed in the .bss as well as null values. This prevents undef global values taking up potentially huge amounts of space in the .data section. The following two lines now both generate equivalent .bss data: @vals1 = internal unnamed_addr global [20000000 x i32] zeroinitializer, align 4 @vals2 = internal unnamed_addr global [20000000 x i32] undef, align 4 ; previously unaccounted for This is primarily motivated by the corresponding issue in the Rust compiler (https://github.com/rust-lang/rust/issues/41315). Differential Revision: https://reviews.llvm.org/D41705 Patch by varkor! llvm-svn: 324424
* [LivePhysRegs] Fix handling of return instructions.Eli Friedman2018-02-061-17/+14
| | | | | | | | | | | | | | | | | See D42509 for the original version of this. Basically, there are two significant changes to behavior here: - addLiveOuts always adds all pristine registers (even if a block has no successors). - addLiveOuts and addLiveOutsNoPristines always add all callee-saved registers for return blocks (including conditional return blocks). I cleaned up the functions a bit to make it clear these properties hold. Differential Revision: https://reviews.llvm.org/D42655 llvm-svn: 324422
* [AArch64] Adjust the cost model for Exynos M3Evandro Menezes2018-02-061-4/+4
| | | | | | | Fix the modeling of long division and SIMD conversion from integer and horizontal minimum and maximum. llvm-svn: 324417
* Add SelectionDAGDumper support for strict FP nodesAndrew Kaylor2018-02-061-0/+20
| | | | | | Patch by Kevin P. Neal llvm-svn: 324416
* Fix a crash when emitting DIEs for variable-length arraysAdrian Prantl2018-02-063-33/+50
| | | | | | | | | | | | | VLAs may refer to a previous DIE to express the DW_AT_count of their type. Clang generates an artificial "vla_expr" variable for this. If this DIE hasn't been created yet LLVM asserts. This patch fixes this by sorting the local variables so that dependencies come before they are needed. It also replaces the linear scan in DWARFFile with a std::map, which can be faster. Differential Revision: https://reviews.llvm.org/D42940 llvm-svn: 324412
* [ORC] Use explicit constructor calls to fix a builder error atLang Hames2018-02-061-3/+3
| | | | | | http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/17627 llvm-svn: 324411
* [ORC] Start migrating ORC layers to use the new ORC Core.h APIs.Lang Hames2018-02-064-86/+251
| | | | | | | | | | | | | | | | | | | | | | | | | | | | In particular this patch switches RTDyldObjectLinkingLayer to use orc::SymbolResolver and threads the requried changse (ExecutionSession references and VModuleKeys) through the existing layer APIs. The purpose of the new resolver interface is to improve query performance and better support parallelism, both in JIT'd code and within the compiler itself. The most visibile change is switch of the <Layer>::addModule signatures from: Expected<Handle> addModule(std::shared_ptr<ModuleType> Mod, std::shared_ptr<JITSymbolResolver> Resolver) to: Expected<Handle> addModule(VModuleKey K, std::shared_ptr<ModuleType> Mod); Typical usage of addModule will now look like: auto K = ES.allocateVModuleKey(); Resolvers[K] = createSymbolResolver(...); Layer.addModule(K, std::move(Mod)); See the BuildingAJIT tutorial code for example usage. llvm-svn: 324405
* [DSE] Upgrade uses of MemoryIntrinic::getAlignment() to new API. (NFC)Daniel Neilson2018-02-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the DeadStoreElimination pass to cease using the old getAlignment() API of MemoryIntrinsic in favour of getting dest specific alignments through the new API. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment() and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 324402
* [TargetLowering] use local variable to reduce duplication; NFCISanjay Patel2018-02-061-52/+32
| | | | llvm-svn: 324401
* [TargetLowering] use local variables to reduce duplication; NFCISanjay Patel2018-02-061-6/+6
| | | | llvm-svn: 324397
* [InferAddressSpaces] Update uses of IRBuilder memory intrinsic creation to ↵Daniel Neilson2018-02-061-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | new API Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the InferAddressSpaces pass to cease using: 1) The old getAlignment() API of MemoryIntrinsic in favour of getting source & dest specific alignments through the new API. 2) The old IRBuilder CreateMemCpy/CreateMemMove single-alignment APIs in favour of the new API that allows setting source and destination alignments independently. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment() and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 324395
* [DWARFv5] Emit .debug_line_str (in a non-DWO file).Paul Robinson2018-02-063-23/+124
| | | | | | | | This should enable the linker to do string-pooling of path names. Differential Revision: https://reviews.llvm.org/D42707 llvm-svn: 324393
* [Hexagon] Extract HVX lowering and selection into HVX-specific files, NFCKrzysztof Parzyszek2018-02-066-581/+572
| | | | llvm-svn: 324392
* [Hexagon] Lower concat of more than 2 vectors into build_vectorKrzysztof Parzyszek2018-02-062-15/+24
| | | | llvm-svn: 324391
* [InlineFunction] Update deprecated use of IRBuilder CreateMemCpy (NFC)Daniel Neilson2018-02-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the InlineFunction pass to ceause using the old IRBuilder CreateMemCpy single-alignment API in favour of the new API that allows setting source and destination alignments independently. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment() and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html llvm-svn: 324384
* [AMDGPU] removed dead code handling rmw in memory legalizerStanislav Mekhanoshin2018-02-061-66/+3
| | | | | | | | | It was always using cmpxchg path and in rmw and cmpxchg instructions are not distinguishable in the BE. Differential Revision: https://reviews.llvm.org/D42976 llvm-svn: 324383
* [Hexagon] Don't form new-value jumps from floating-point instructionsKrzysztof Parzyszek2018-02-061-0/+16
| | | | | | | Additionally, verify that the register defined by the producer is a 32-bit register. llvm-svn: 324381
* [InstCombine][ValueTracking] Match non-uniform constant power-of-two vectorsSimon Pilgrim2018-02-061-8/+5
| | | | | | | | Generalize existing constant matching to work with non-uniform constant vectors as well. Differential Revision: https://reviews.llvm.org/D42818 llvm-svn: 324369
* [ARM] f16 conversionsSjoerd Meijer2018-02-061-16/+23
| | | | | | | | | This is a follow up of r324321, adding f16 <-> f32 and f16 <-> f64 conversion match patterns. Differential Revision: https://reviews.llvm.org/D42954 llvm-svn: 324360
* [DAG, X86] Improve Dependency analysis when doing multi-nodeNirav Dave2018-02-062-241/+119
| | | | | | | | | | | | | | | | | | | | Instruction Selection Cleanup cycle/validity checks in ISel (IsLegalToFold, HandleMergeInputChains) and X86 (isFusableLoadOpStore). Now do a full search for cycles / dependencies pruning the search when topological property of NodeId allows. As part of this propogate the NodeId-based cutoffs to narrow hasPreprocessorHelper searches. Reviewers: craig.topper, bogner Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D41293 llvm-svn: 324359
* AMDGPU: Fix S_BUFFER_LOAD_DWORD_SGPR moveToVALUMarek Olsak2018-02-061-2/+8
| | | | | | | | Author: Bas Nieuwenhuizen https://reviews.llvm.org/D42881 llvm-svn: 324353
* [Hexagon] Remove leftover assertKrzysztof Parzyszek2018-02-061-3/+1
| | | | llvm-svn: 324352
* [Hexagon] Split HVX operations on vector pairsKrzysztof Parzyszek2018-02-064-70/+265
| | | | | | | | Vector pairs are legal types, but not every operation can work on pairs. For those operations that are legal for single vectors, generate a concat of their results on pair halves. llvm-svn: 324350
OpenPOWER on IntegriCloud