path: root/llvm/lib
Commit message  Author  Age  Files  Lines
* AMDGPU: Implement isFMAFasterThanFMulAndFAdd for f16Matt Arsenault2016-12-221-0/+2
| | | | llvm-svn: 290307
* AMDGPU: Allow rcp and rsq usage with f16Matt Arsenault2016-12-222-4/+8
| | | | llvm-svn: 290302
* AMDGPU: Custom lower f16 fdivMatt Arsenault2016-12-222-1/+22
| | | | llvm-svn: 290301
* AMDGPU: Implement f16 fcanonicalizeMatt Arsenault2016-12-223-0/+9
| | | | llvm-svn: 290300
* AMDGPU: Update isFPImmLegal for f16Matt Arsenault2016-12-221-1/+2
| | | | | | I don't think this matters because ConstantFP is legal. llvm-svn: 290299
* Clear the PendingTypeTests vector after moving from it.Peter Collingbourne2016-12-221-0/+2
| | | | | | | This is to put the vector into a well defined state. Apparently the state of a vector after being moved from is valid but unspecified. Found with clang-tidy. llvm-svn: 290298
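The pattern described in this message can be sketched in isolation. This is a minimal illustration, not the code from the commit: `PendingQueue` and `take` are hypothetical names, and only the "move out, then clear" idiom is taken from the message.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a class that accumulates pending work items.
struct PendingQueue {
  std::vector<std::string> Pending;

  // Hands the accumulated items to the caller and leaves the queue reusable.
  std::vector<std::string> take() {
    std::vector<std::string> Result = std::move(Pending);
    // Following the commit message: treat the moved-from vector as valid but
    // unspecified, and reset it explicitly so later code sees an empty vector.
    Pending.clear();
    return Result;
  }
};
```

Calling `clear()` on a moved-from vector is always permitted because it has no preconditions, so the extra call is cheap insurance rather than a behavioral change.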
* [AArch64] Correct the check of signed 9-bit imm in getIndexedAddressParts().Haicheng Wu2016-12-221-2/+4
| | | | | | | | -256 is a legal indexed address part. Differential Revision: https://reviews.llvm.org/D27537 llvm-svn: 290296
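The boundary case can be made concrete with a small range check. This is a sketch under assumptions: `fitsSigned9` is a hypothetical helper, not the code in getIndexedAddressParts(), and the off-by-one variant is invented only to show how -256 can be rejected by mistake.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if Offset fits in a signed 9-bit immediate,
// i.e. lies in the inclusive range [-256, 255].
constexpr bool fitsSigned9(int64_t Offset) {
  return Offset >= -256 && Offset <= 255;
}

// Illustrative off-by-one variant (an assumption, not the original code):
// a symmetric strict comparison wrongly excludes -256.
constexpr bool fitsSigned9OffByOne(int64_t Offset) {
  return Offset > -256 && Offset < 256;
}
```

A signed N-bit field is asymmetric (one more negative value than positive), which is why symmetric comparisons tend to get the lower bound wrong.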
* Pass GetAssumptionCache to InlineFunctionInfo constructorEaswaran Raman2016-12-221-1/+1
| | | | | | Differential revision: https://reviews.llvm.org/D28038 llvm-svn: 290295
* [NVVMIntrRange] Only set range metadata if none is already presentDavid Majnemer2016-12-221-0/+4
| | | | | | | The range metadata inserted by NVVMIntrRange is pessimistic, range metadata already present could be more precise. llvm-svn: 290294
* [LLParser] Make the line field of DIMacro(File) optional.Adrian Prantl2016-12-221-2/+2
| | | | | | Otherwise these records do not survive roundtrips. llvm-svn: 290291
* [GlobalISel] Add basic Selector-emitter tblgen backend.Ahmed Bougacha2016-12-213-6/+13
| | | | | | | | | | | | | | | | | This adds a basic tablegen backend that analyzes the SelectionDAG patterns to find simple ones that are eligible for GlobalISel-emission. That's similar to FastISel, with one notable difference: we're not fed ISD opcodes, so we need to map the SDNode operators to generic opcodes. That's done using GINodeEquiv in TargetGlobalISel.td. Otherwise, this is mostly boilerplate, and lots of filtering of any kind of "complicated" pattern. On AArch64, this is sufficient to match G_ADD up to s64 (to ADDWrr/ADDXrr) and G_BR (to B). Differential Revision: https://reviews.llvm.org/D26878 llvm-svn: 290284
* [AsmWriter] Remove redundant cast<>s. NFC.Ahmed Bougacha2016-12-211-2/+2
| | | | llvm-svn: 290283
* [WebAssembly] Fix the opcode value for i64.rotr.Dan Gohman2016-12-211-1/+1
| | | | llvm-svn: 290281
* IR: Function summary representation for type tests.Peter Collingbourne2016-12-213-8/+46
| | | | | | | | | | | Each function summary has an attached list of type identifier GUIDs. The idea is that during the regular LTO phase we would match these GUIDs to type identifiers defined by the regular LTO module and store the resolutions in a top-level "type identifier summary" (which will be implemented separately). Differential Revision: https://reviews.llvm.org/D27967 llvm-svn: 290280
* [AArch64] Remove a redundant check. NFC.Haicheng Wu2016-12-211-2/+1
| | | | | | | | The case AM.Scale == 0 is already handled by the code right above. Differential Revision: https://reviews.llvm.org/D28003 llvm-svn: 290275
* Add the ability for DWARFDie objects to get the parent DWARFDie.Greg Clayton2016-12-213-36/+56
| | | | | | | | | | | | In order for the llvm DWARF parser to be used in LLDB we will need to be able to get the parent of a DIE. This patch adds that functionality by changing the DWARFDebugInfoEntry class to store a depth field instead of a sibling index. Using a depth field allows us to easily calculate the sibling and the parent without increasing the size of DWARFDebugInfoEntry. I tested llvm-dsymutil on a debug version of clang where this fully parses DWARF in over 1200 .o files to verify there was no serious regression in performance. Added a full suite of unit tests to test this functionality. Differential Revision: https://reviews.llvm.org/D27995 llvm-svn: 290274
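To show why a depth field is enough to recover both parent and sibling, here is a simplified sketch. It assumes the entries are stored in pre-order in a flat array and ignores details of real DWARF such as null terminator entries; `Entry`, `parentOf`, `siblingOf` and `NoEntry` are hypothetical names, not the DWARFDebugInfoEntry API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical flattened record: entries are stored in pre-order and each one
// remembers only its nesting depth (0 for the root entry).
struct Entry {
  unsigned Depth;
};

// Sentinel index meaning "no such entry".
const std::size_t NoEntry = static_cast<std::size_t>(-1);

// Parent: the nearest preceding entry that is exactly one level shallower.
std::size_t parentOf(const std::vector<Entry> &Dies, std::size_t I) {
  if (Dies[I].Depth == 0)
    return NoEntry;
  for (std::size_t J = I; J-- > 0;)
    if (Dies[J].Depth == Dies[I].Depth - 1)
      return J;
  return NoEntry;
}

// Sibling: the next entry at the same depth, unless a shallower entry
// (the end of the parent's subtree) is reached first.
std::size_t siblingOf(const std::vector<Entry> &Dies, std::size_t I) {
  for (std::size_t J = I + 1; J < Dies.size(); ++J) {
    if (Dies[J].Depth == Dies[I].Depth)
      return J;
    if (Dies[J].Depth < Dies[I].Depth)
      break;
  }
  return NoEntry;
}
```

Both lookups are linear scans over neighbouring entries, which trades a little time for not storing an extra index per entry.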
* Update mailing list post URL and add libunwind referenceEd Maste2016-12-211-1/+2
| | | | | | | | | | | | | RTDyldMemoryManager.cpp describes the differing __register_frame API between libunwind and libgcc, with a mailing list posting URL. The original link was 404; replace it with what I believe is the intended post, as well as a reference to the "OS X" implementation in libunwind. Differential Revision: https://reviews.llvm.org/D27965 llvm-svn: 290269
* [X86][SSE] Improve lowering of vXi64 multiplies Simon Pilgrim2016-12-212-26/+35
| As mentioned on PR30845, we were performing our vXi64 multiplication as:
|
|   AloBlo = pmuludq(a, b);
|   AloBhi = pmuludq(a, psrlqi(b, 32));
|   AhiBlo = pmuludq(psrlqi(a, 32), b);
|   return AloBlo + psllqi(AloBhi, 32) + psllqi(AhiBlo, 32);
|
| when we could avoid one of the upper shifts with:
|
|   AloBlo = pmuludq(a, b);
|   AloBhi = pmuludq(a, psrlqi(b, 32));
|   AhiBlo = pmuludq(psrlqi(a, 32), b);
|   return AloBlo + psllqi(AloBhi + AhiBlo, 32);
|
| This matches the lowering on gcc/icc.
|
| Differential Revision: https://reviews.llvm.org/D27756
| llvm-svn: 290267
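The equivalence of the two forms can be checked with a scalar model of a single 64-bit lane. This is a sketch, not the X86 lowering code: `mulLo32` models pmuludq on one lane, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of pmuludq on one lane: multiply the low 32 bits of each
// operand to produce a full 64-bit product.
inline uint64_t mulLo32(uint64_t A, uint64_t B) {
  return (A & 0xFFFFFFFFull) * (B & 0xFFFFFFFFull);
}

// Earlier form: each cross product is shifted separately.
inline uint64_t mul64TwoShifts(uint64_t A, uint64_t B) {
  uint64_t AloBlo = mulLo32(A, B);
  uint64_t AloBhi = mulLo32(A, B >> 32);
  uint64_t AhiBlo = mulLo32(A >> 32, B);
  return AloBlo + (AloBhi << 32) + (AhiBlo << 32);
}

// Improved form: add the cross products first, then shift once.
inline uint64_t mul64OneShift(uint64_t A, uint64_t B) {
  uint64_t AloBlo = mulLo32(A, B);
  uint64_t AloBhi = mulLo32(A, B >> 32);
  uint64_t AhiBlo = mulLo32(A >> 32, B);
  return AloBlo + ((AloBhi + AhiBlo) << 32);
}
```

The rewrite is valid because a left shift distributes over addition modulo 2^64, and the Ahi*Bhi term is dropped in both forms since it only affects bits 64 and above.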
* Revert "[InstCombine] New opportunities for FoldAndOfICmp and FoldXorOfICmp"David Majnemer2016-12-212-98/+2
| | | | | | This reverts commit r289813, it caused PR31449. llvm-svn: 290266
* AMDGPU/SI: Fix file headerTom Stellard2016-12-211-1/+1
| | | | llvm-svn: 290265
* TypeMetadataUtils: Simplify; spotted by Mehdi.Peter Collingbourne2016-12-211-2/+1
| | | | llvm-svn: 290264
* Add missing includes on Windows.Zachary Turner2016-12-211-0/+1
| | | | | | | Patch by Andrey Khalyavin Differential Revision: https://reviews.llvm.org/D27915 llvm-svn: 290263
* [LLParser] Parse vector GEP constant expression correctlyMichael Kuperstein2016-12-211-4/+7
| | | | | | | | | | | The constantexpr parsing was too constrained and rejected legal vector GEPs. This relaxes it to be similar to the ones for instruction parsing. This fixes PR30816. Differential Revision: https://reviews.llvm.org/D28013 llvm-svn: 290261
* [ConstantFolding] Fix vector GEPs harderMichael Kuperstein2016-12-211-3/+6
| | | | | | | | | | For vector GEPs, CastGEPIndices can end up in an infinite recursion, because we compare the vector type to the scalar pointer type, find them different, and then try to cast a type to itself. Differential Revision: https://reviews.llvm.org/D28009 llvm-svn: 290260
* [CostModel] Pass shuffle mask args with ArrayRef. NFCI.Simon Pilgrim2016-12-211-2/+2
| | | | llvm-svn: 290257
* revert first commit . removing empty line in X86.hMichael Zuckerman2016-12-211-1/+0
| | | | llvm-svn: 290255
* First commit adding new line to X86.hMichael Zuckerman2016-12-211-0/+1
| | | | llvm-svn: 290254
* Added a template for building target specific memory node in DAG.Elena Demikhovsky2016-12-215-117/+359
| I added an API for creating a target-specific memory node in the DAG. Today, all memory nodes are common to all targets and their constructors are located in SelectionDAG.cpp. There are some cases in X86 where we need to create a special node: truncation-with-saturation store, float-to-half store. In the current patch I added truncation-with-saturation nodes and I'm using them for intrinsics. In the future I plan to implement DAG lowering for the truncation-with-saturation pattern.
|
| Differential Revision: https://reviews.llvm.org/D27899
| llvm-svn: 290250
* [AMDGPU] Garbage collect dead code. NFCI.Davide Italiano2016-12-211-15/+0
| | | | llvm-svn: 290249
* [X86] Vectorcall Calling Convention - Adding CodeGen Complete SupportOren Ben Simhon2016-12-211-4/+4
| | | | | | Fixing a warning. llvm-svn: 290248
* [X86] Vectorcall Calling Convention - Adding CodeGen Complete SupportOren Ben Simhon2016-12-211-4/+4
| | | | | | Fixing build issues. llvm-svn: 290244
* [X86] Vectorcall Calling Convention - Adding CodeGen Complete SupportOren Ben Simhon2016-12-216-66/+269
| The vectorcall calling convention specifies that arguments to functions are to be passed in registers, when possible. vectorcall uses more registers for arguments than fastcall or the default x64 calling convention use. The vectorcall calling convention is only supported in native code on x86 and x64 processors that include Streaming SIMD Extensions 2 (SSE2) and above.
|
| The current implementation does not handle Homogeneous Vector Aggregates (HVAs) correctly and this review attempts to fix it. This submission also includes additional lit tests to better cover HVA corner cases.
|
| Differential Revision: https://reviews.llvm.org/D27392
| llvm-svn: 290240
* [LDist] Match behavior between invoking via optimization pipeline or opt -loop-distributeAdam Nemet2016-12-212-32/+9
| In r267672, where the loop distribution pragma was introduced, I tried hard to keep the old behavior for opt: when opt is invoked with -loop-distribute, it should distribute the loop (it's off by default when run via the optimization pipeline).
|
| As MichaelZ has discovered, this has the unintended consequence of breaking a very common developer workflow for reproducing compilations using opt: first you print the pass pipeline of clang with -debug-pass=Arguments and then invoke opt with the returned arguments.
|
| clang -debug-pass will include -loop-distribute, but the pass is invoked with default=off, so nothing happens unless the loop carries the pragma, while through opt (default=on) we will try to distribute all loops.
|
| This changes opt's default to off as well to match clang. The tests are modified to explicitly enable the transformation.
|
| llvm-svn: 290235
* [APFloat] Remove 'else' after return. NFCTim Shen2016-12-211-13/+15
| | | | | | | | | | Reviewers: kbarton, iteratee, hfinkel, echristo Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D27934 llvm-svn: 290232
* machine combiner: fix pretty printerSebastian Pop2016-12-212-8/+10
| We used to print UNKNOWN instructions when the instruction to be printed was not yet inserted in any BB: in that case the pretty printer would not be able to compute a TII, as the instruction does not belong to any BB or function yet. This patch explicitly passes the TII to the pretty printer.
|
| Differential Revision: https://reviews.llvm.org/D27645
| llvm-svn: 290228
* IPO: Remove the ModuleSummary argument to the FunctionImport pass. NFCI.Peter Collingbourne2016-12-212-34/+15
| | | | | | | | | No existing client is passing a non-null value here. This will come back in a slightly different form as part of the type identifier summary work. Differential Revision: https://reviews.llvm.org/D28006 llvm-svn: 290222
* [Analysis] Centralize objectsize lowering logic.George Burgess IV2016-12-204-28/+40
| | | | | | | | | We're currently doing nearly the same thing for @llvm.objectsize in three different places: two of them are missing checks for overflow, and one of them could subtly break if InstCombine gets much smarter about removing alloc sites. Seems like a good idea to not do that. llvm-svn: 290214
* Move GlobPattern class from LLD to llvm/Support.Rui Ueyama2016-12-202-0/+168
| | | | | | | | | | GlobPattern is a class to handle glob pattern matching. Currently only LLD is using that, but technically that feature is not specific to linkers, so in this patch I move that file to LLVM. Differential Revision: https://reviews.llvm.org/D27969 llvm-svn: 290212
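For readers unfamiliar with glob matching, the idea can be sketched with a small standalone function. This does not reproduce GlobPattern's interface or feature set: `globMatch` is a hypothetical helper that handles only `*` (any run of characters) and `?` (exactly one character).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical minimal glob matcher supporting only '*' and '?'.
// Uses the classic single-backtrack-point algorithm: remember the most recent
// '*' and, on a mismatch, let it absorb one more character of the input.
bool globMatch(const std::string &Pat, const std::string &Str) {
  std::size_t P = 0, S = 0;
  std::size_t StarP = std::string::npos; // position of the last '*' seen
  std::size_t StarS = 0;                 // input position that '*' restarts from
  while (S < Str.size()) {
    if (P < Pat.size() && Pat[P] == '*') {
      StarP = P++;
      StarS = S;
    } else if (P < Pat.size() && (Pat[P] == '?' || Pat[P] == Str[S])) {
      ++P;
      ++S;
    } else if (StarP != std::string::npos) {
      P = StarP + 1;
      S = ++StarS;
    } else {
      return false;
    }
  }
  // Any trailing '*' can match the empty string.
  while (P < Pat.size() && Pat[P] == '*')
    ++P;
  return P == Pat.size();
}
```

For example, `globMatch("*.o", "foo.o")` succeeds while `globMatch("a*", "b")` does not.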
* [SCEV] Be less conservative when extending bitwidths for computing ranges.Michael Zolotukhin2016-12-201-7/+6
| Summary:
| In getRangeForAffineAR we compute ranges for affine exprs E = A + B*C, where ranges for A, B, and C are known. To avoid overflow, we need to operate on a bigger bitwidth, and originally we chose 2*x+1 for this (x being the original bitwidth). However, it is safe to use just 2*x:
|
|   A + B*C <= (2^x - 1) + (2^x - 1)*(2^x - 1)
|            = 2^x - 1 + 2^(2x) - 2^x - 2^x + 1
|            = 2^(2x) - 2^x
|           <= 2^(2x) - 1
|
| Unnecessary extending of bitwidths results in noticeable slowdowns: ranges perform arithmetic operations using APInt, which are much slower when bitwidths are bigger than 64.
|
| Reviewers: sanjoy, majnemer, chandlerc
| Subscribers: llvm-commits
| Differential Revision: https://reviews.llvm.org/D27795
| llvm-svn: 290211
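The bound in this message can be spot-checked numerically for small bitwidths. This is a sketch using plain 64-bit integers rather than APInt; `maxAffineValue` and `fitsInDoubleWidth` are hypothetical helpers, and the check is limited to x <= 32 so that every intermediate value fits in uint64_t.

```cpp
#include <cassert>
#include <cstdint>

// Largest value of A + B*C when A, B and C are unsigned X-bit values.
// Only valid for 1 <= X <= 32, so the result fits in 64 bits.
inline uint64_t maxAffineValue(unsigned X) {
  uint64_t Max = (1ull << X) - 1; // 2^X - 1
  return Max + Max * Max;         // equals 2^(2X) - 2^X
}

// True if the worst case fits in 2*X bits, i.e. is at most 2^(2X) - 1.
inline bool fitsInDoubleWidth(unsigned X) {
  uint64_t Limit = (X == 32) ? UINT64_MAX : ((1ull << (2 * X)) - 1);
  return maxAffineValue(X) <= Limit;
}
```

For x = 8 the worst case is 255 + 255*255 = 65280, which is below the 16-bit maximum of 65535, in line with the derivation above.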
* Revert "[ObjectYAML] Support for DWARF debug_info section"Chris Bieneman2016-12-202-41/+4
| | | | | | | | | | | This reverts commit r290204. Still breaking bots... In a meeting now, so I can't fix it immediately. Bot URL: http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/2415 llvm-svn: 290209
* [ObjectYAML] Support for DWARF debug_info sectionChris Bieneman2016-12-202-4/+41
| | | | | | | | This patch adds support for YAML<->DWARF for debug_info sections. This re-lands r290147, after fixing the issue that caused bots to fail (thank you UBSan!). llvm-svn: 290204
* IR: Eliminate non-determinism in the module summary analysis.Peter Collingbourne2016-12-203-107/+82
| | | | | | | | | Also make the summary ref and call graph vectors immutable. This means a smaller API surface and fewer places to audit for non-determinism. Differential Revision: https://reviews.llvm.org/D27875 llvm-svn: 290200
* [LoopUnroll] Modify a comment to clarify the usage of TripCount. NFC.Haicheng Wu2016-12-201-8/+8
| | | | | | | | | Make it clear that TripCount is the upper bound of the iteration on which control exits LatchBlock. Differential Revision: https://reviews.llvm.org/D26675 llvm-svn: 290199
* [ARM] Implement isExtractSubvectorCheap.Eli Friedman2016-12-202-0/+12
| | | | | | | | | | | | | | See https://reviews.llvm.org/D6678 for the history of isExtractSubvectorCheap. Essentially the same considerations apply to ARM. This temporarily breaks the formation of vpadd/vpaddl in certain cases; AddCombineToVPADDL essentially assumes that we won't form VUZP shuffles. See https://reviews.llvm.org/D27779 for followup fix. Differential Revision: https://reviews.llvm.org/D27774 llvm-svn: 290198
* Use MaxDepth instead of repeating its valueMatt Arsenault2016-12-201-3/+3
| | | | llvm-svn: 290194
* AMDGPU: Allow 16-bit types in inline asm constraintsMatt Arsenault2016-12-201-0/+2
| | | | llvm-svn: 290193
* AMDGPU: Don't add same instruction multiple times to worklistMatt Arsenault2016-12-201-1/+7
| | | | | | | | | When the instruction is processed the first time, it may be deleted resulting in crashes. While the new test adds the same user to the worklist twice, this particular case doesn't crash but I'm not sure why. llvm-svn: 290191
* Replace std::find_if with llvm::find_if. NFC.George Burgess IV2016-12-201-5/+4
| | | | llvm-svn: 290190
* AMDGPU/SI: Make a function constTom Stellard2016-12-202-4/+3
| | | | llvm-svn: 290185
* AMDGPU/SI: Add a MachineMemOperand when lowering llvm.amdgcn.buffer.load.*Tom Stellard2016-12-206-6/+77
| | | | | | | | | | Reviewers: arsenm, nhaehnle, mareko Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D27834 llvm-svn: 290184