summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* AMDGPU: Remove unused private member of AMDGPUTargetELFStreamerKonstantin Zhuravlyov2018-02-162-2/+1
| | | | llvm-svn: 325408
* Remove an unused function.Eric Christopher2018-02-161-4/+0
| | | | llvm-svn: 325403
* AMDGPU: Bring elf flags in sync with the specKonstantin Zhuravlyov2018-02-166-31/+116
| | | | | | | | | | | - Add MACH flags - Add XNACK flag - Add reserved flags - Minor cleanups in docs Differential Revision: https://reviews.llvm.org/D43356 llvm-svn: 325399
* [Constant] add floating-point helpers for normal/finite-nz; NFCSanjay Patel2018-02-162-42/+39
| | | | | | | | | ...and delete the equivalent local functiona from InstCombine. These might be useful to other InstCombine files or other passes and makes FP queries more similar to integer constant queries. llvm-svn: 325398
* [X86] In lowerVSELECTtoVectorShuffle, don't map undef select condition to ↵Craig Topper2018-02-161-3/+6
| | | | | | | | | | undef in shuffle mask. Undef in select condition means we should pick the element from one side or the other. An undef in a shuffle mask means pick any element from either source or worse. I suspect by the time we get here most of the undefs in a constant vector have been removed by other things, but doing this for safety. llvm-svn: 325394
* AMDGPU: Bring processors and features in sync with the specKonstantin Zhuravlyov2018-02-164-18/+6
| | | | | | | | | | - Remove gfx800 - Make iceland gfx802 - Add xnack to gfx902 Differential Revision: https://reviews.llvm.org/D43355 llvm-svn: 325393
* Fix emission of PDB string table.Zachary Turner2018-02-164-171/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was originally reported as a bug with the symptom being "cvdump crashes when printing an LLD-linked PDB that has an S_FILESTATIC record in it". After some additional investigation, I determined that this was a symptom of a larger problem, and in fact the real problem was in the way we emitted the global PDB string table. As evidence of this, you can take any lld-generated PDB, run cvdump -stringtable on it, and it would return no results. My hypothesis was that cvdump could not *find* the string table to begin with. Normally it would do this by looking in the "named stream map", finding the string /names, and using its value as the stream index. If this lookup fails, then cvdump would fail to load the string table. To test this hypothesis, I looked at the name stream map generated by a link.exe PDB, and I emitted exactly those bytes into an LLD-generated PDB. Suddenly, cvdump could read our string table! This code has always been hacky and we knew there was something we didn't understand. After all, there were some comments to the effect of "we have to emit strings in a specific order, otherwise things don't work". The key to fixing this was finally understanding this. The way it works is that it makes use of a generic serializable hash map that maps integers to other integers. In this case, the "key" is the offset into a buffer, and the value is the stream number. If you index into the buffer at the offset specified by a given key, you find the name. The underlying cause of all these problems is that we were using the identity function for the hash. i.e. if a string's offset in the buffer was 12, the hash value was 12. Instead, we need to hash the string *at that offset*. There is an additional catch, in that we have to compute the hash as a uint32 and then truncate it to uint16. Making this work is a little bit annoying, because we use the same hash table in other places as well, and normally just using the identity function for the hash function is actually what's desired. I'm not totally happy with the template goo I came up with, but it works in any case. The reason we never found this bug through our own testing is because we were building a /parallel/ hash table (in the form of an llvm::StringMap<>) and doing all of our lookups and "real" hash table work against that. I deleted all of that code and now everything goes through the real hash table. Then, to test it, I added a unit test which adds 7 strings and queries the associated values. I test every possible insertion order permutation of these 7 strings, to verify that it really does work as expected. Differential Revision: https://reviews.llvm.org/D43326 llvm-svn: 325386
* Remove useless comment - seems to be a copy+paste typo. NFCISimon Pilgrim2018-02-161-1/+0
| | | | llvm-svn: 325385
* [AArch64] Fix BITCAST lowering crashEvandro Menezes2018-02-161-1/+2
| | | | | | | | | | | The data type is assumed to be a vector, but sometimes it is not, leading to an assertion. Add simple test-case to verify this. Differential revision: https://reviews.llvm.org/D42599 llvm-svn: 325378
* AMDGPU/SI: Extend promoting alloca to vector to arrays of up to 16 elementsChangpeng Fang2018-02-161-1/+12
| | | | | | | | | | | | | | Summary: This patch extends the promotion of alloca to vector to the arrays of up to 16 elements. Also we introduce an option, -disable-promote-alloca-to-vector, to switch promotion to vector off, if needed. Reviewers: arsenm Differential Revision: https://reviews.llvm.org/D33559 llvm-svn: 325372
* [X86] Only reorder srl/and on last DAG combiner runCraig Topper2018-02-161-2/+8
| | | | | | | | | | This seems to interfere with a target independent brcond combine that looks for the (srl (and X, C1), C2) pattern to enable TEST instructions. Once we flip, that combine doesn't fire and we end up exposing it to the X86 specific BT combine which causes us to emit a BT instruction. BT has lower throughput than TEST. We could try to make the brcond combine aware of the alternate pattern, but since the flip was just a code size reduction and not likely to enable other combines, it seemed easier to just delay it until after lowering. Differential Revision: https://reviews.llvm.org/D43201 llvm-svn: 325371
* [X86] Remove call to ShrinkDemandedCosntant from the SHRUNKBLEND creation code.Craig Topper2018-02-161-2/+1
| | | | | | We only run this code if know the condition isn't a constant vector. ShrinkDemandedConstant isn't going to find any different. llvm-svn: 325368
* [WebAssembly] MC: Make explicit our current lack of support for relocations ↵Sam Clegg2018-02-162-6/+19
| | | | | | | | | | | | against unnamed temporary symbols. Add an explicit check before looking up symbol in SymbolIndices. This was previously silently succeeding and returning zero for such unnamed temporaries. Differential Revision: https://reviews.llvm.org/D43365 llvm-svn: 325367
* [InstCombine] clean up fdiv-with-fdiv folds; NFCISanjay Patel2018-02-161-28/+23
| | | | llvm-svn: 325366
* Fix signed/unsigned comparison warning. NFCI.Simon Pilgrim2018-02-161-1/+1
| | | | llvm-svn: 325363
* Fix signed/unsigned comparison warning. NFCI.Simon Pilgrim2018-02-161-3/+3
| | | | llvm-svn: 325359
* [InstCombine] remove redundant debug info setting; NFCSanjay Patel2018-02-161-2/+0
| | | | | | The IRBuilder sets debuginfo in Insert(), so this was duplicating what already happened. llvm-svn: 325358
* [JumpThreading] PR36133 enable/disable DominatorTree for LVI analysisBrian M. Rzycki2018-02-163-1/+69
| | | | | | | | | | | | | | | | | | | | | | Summary: The LazyValueInfo pass caches a copy of the DominatorTree when available. Whenever there are pending DominatorTree updates within JumpThreading's DeferredDominance object we cannot use the cached DT for LVI analysis. This commit adds the new methods enableDT() and disableDT() to LVI. JumpThreading also sets the appropriate usage model before calling LVI analysis methods. Fixes https://bugs.llvm.org/show_bug.cgi?id=36133 Reviewers: sebpop, dberlin, kuhar Reviewed by: sebpop, kuhar Subscribers: uabelho, llvm-commits, aprantl, hiraditya, a.elovikov Differential Revision: https://reviews.llvm.org/D42717 llvm-svn: 325356
* AMDGPU/SI: Turn off GPR Indexing Mode immediately after the interested ↵Changpeng Fang2018-02-161-38/+23
| | | | | | | | | | | | | | | | | | instruction. Summary: In the current implementation of GPR Indexing Mode when the index is of non-uniform, the s_set_gpr_idx_off instruction is incorrectly inserted after the loop. This will lead the instructions with vgpr operands (v_readfirstlane for example) to read incorrect vgpr. In this patch, we fix the issue by inserting s_set_gpr_idx_on/off immediately around the interested instruction. Reviewers: rampitec Differential Revision: https://reviews.llvm.org/D43297 llvm-svn: 325355
* [SelectionDAG] Enable SimplifyDemandedVectorElts support for simplifying ↵Simon Pilgrim2018-02-161-0/+25
| | | | | | | | | | shuffle masks Based off the DemandedElts mask the and UNDEF elements returned from the SimplifyDemandedVectorElts calls to the shuffle operands, we can attempt to simplify the shuffle mask. I had to be very conservative here as accepting post-legalized shuffle masks could cause problems for targets that legalize UNDEF mask elements back to inrange values (PowerPC), similarly combining to identity shuffle masks could cause too much UNDEF information to disappear for later combines. llvm-svn: 325354
* [InstCombine] reduce code duplication; NFCSanjay Patel2018-02-161-31/+19
| | | | llvm-svn: 325353
* [X86][SSE] Allow float domain crossing if we are merging 2 or more shuffles ↵Simon Pilgrim2018-02-161-0/+1
| | | | | | and the root started as a float domain shuffle llvm-svn: 325349
* [PowerPC] Fix transform in table gen file causing UBNemanja Ivanovic2018-02-161-1/+1
| | | | | | | | Running a bootstrap build with UBSan produces a number of instances where we have signed integer overflow due to this transform. Change the type to long to prevent this UB on 64-bit build machines. llvm-svn: 325347
* [mips] Remove codegen support from some 16 bit instructionsSimon Dardis2018-02-161-12/+5
| | | | | | | | | | | | These instructions conflict with their full length variants for the purposes of FastISel as they cannot be distingushed based on the number and type of operands and predicates. Reviewers: atanasyan Differential Revision: https://reviews.llvm.org/D41285 llvm-svn: 325341
* [SelectionDAG] Add initial SimplifyDemandedVectorElts support for ↵Simon Pilgrim2018-02-161-0/+20
| | | | | | | | simplifying VSELECT operands This just adds a basic pass through - we can add constant selection mask handling in a future patch to fully match InstCombine. llvm-svn: 325338
* [Transforms] Propagate TBAA info in SROAIvan A. Kosarev2018-02-161-22/+61
| | | | | | | | | | | | | | | Now that we have the new TBAA metadata format that is capable of representing accesses to aggregates, we can propagate TBAA access tags from memory setting and transferring intrinsics to load and store instructions and vice versa. Since SROA produces lots of new loads and stores on optimized builds, this change significantly decreases the share of undecorated memory accesses on such builds. Differential Revision: https://reviews.llvm.org/D41563 llvm-svn: 325329
* [ARM] Return true in enableMultipleCopyHints().Jonas Paulsson2018-02-161-0/+1
| | | | | | | | | | Enable multiple COPY hints to eliminate more COPYs during register allocation. Note that this is something all targets should do, see https://reviews.llvm.org/D38128. Review: Eli Friedman llvm-svn: 325327
* [LegalizeDAG] Fix legalization of SETCCMikhail Maltsev2018-02-161-1/+1
| | | | | | | | | | | | | | | | | | | Summary: Currently when expanding a SETCC node into a SELECT_CC, LLVM uses an incorrect type for determining BooleanContent of the result. This patch fixes the issue. Fixes PR36079. Reviewers: rogfer01, javed.absar, efriedma Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43282 llvm-svn: 325325
* [ARM] Materialise some boolean values to avoid a branchRoger Ferrer Ibanez2018-02-161-10/+89
| | | | | | | | | | | | | This patch combines some cases of ARMISD::CMOV for integers that arise in comparisons of the form a != b ? x : 0 a == b ? 0 : x and that currently (e.g. in Thumb1) are emitted as branches. Differential Revision: https://reviews.llvm.org/D34515 llvm-svn: 325323
* [ThinLTO] Import global variablesEugene Leviant2018-02-162-20/+88
| | | | | | Differential revision: https://reviews.llvm.org/D43077 llvm-svn: 325320
* [X86] Allow CMOVs of constants to be sign extended from i32.Craig Topper2018-02-161-4/+5
| | | | | | Sign extending i32 constants only requires a REX prefix as does widening the CMOV. This is cheaper than the explicit sign extend op. llvm-svn: 325318
* [X86] Don't zero_extend cmov up to i64, stop at i32.Craig Topper2018-02-161-11/+26
| | | | | | Zero extend from i32 to i64 is free. So extend from i16 to i32, and then use a free zero extend to finish. llvm-svn: 325317
* [APInt] Fix extractBits to correctly handle Result.isSingleWord() case.Tim Shen2018-02-161-1/+2
| | | | | | | | | | | | Summary: extractBits assumes that `!this->isSingleWord() implies !Result.isSingleWord()`, which may not necessarily be true. Handle both cases. Reviewers: RKSimon Subscribers: sanjoy, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D43363 llvm-svn: 325311
* [GVN] Partially revert debug info salvage change (r325063)Vedant Kumar2018-02-161-1/+0
| | | | | | | | | | | | | In r325063, we salvaged debug values from dying instructions in GVN::processBlock() and GVN::performScalarPRE(). The change in performScalarPRE(), while correct, is unhelpful. It introduced a call to salvageDebugInfo() which was immediately followed by a RAUW, meaning it prevented the RAUW from efficiently updating dbg.value intrinsics. This commit reverts the mistake and tightens up the affected test case. llvm-svn: 325308
* [DCE] Salvage debug info from dead instsVedant Kumar2018-02-151-0/+3
| | | | | | | This results in small increases in the size of the .debug_loc section and the number of unique source variables in a stage2 build of opt. llvm-svn: 325301
* [AMDGPU] Combine adjacent waitcounts in a single strongest waitStanislav Mekhanoshin2018-02-151-7/+44
| | | | | | Differential Revision: https://reviews.llvm.org/D43350 llvm-svn: 325299
* [X86][3DNOW] Teach decoder about AMD 3DNow! instrsRafael Auler2018-02-154-11/+50
| | | | | | | | | | | | | | | | | | | Summary: This patch makes the decoder understand old AMD 3DNow! instructions that have never been properly supported in the X86 disassembler, despite being supported in other subsystems. Hopefully this should make the X86 decoder more complete with respect to binaries containing legacy code. Reviewers: craig.topper Reviewed By: craig.topper Subscribers: llvm-commits, maksfb, bruno Differential Revision: https://reviews.llvm.org/D43311 llvm-svn: 325295
* [X86] Enable BT to be used in place of TEST for single bit checks under optsizeCraig Topper2018-02-151-6/+9
| | | | | | | | We already do this for 64-bit when it won't fit into a 64-bit AND/TEST's immediate field. This adds an additional qualifier to do it for any single bit constant larger than 8-bits under optsize Differential Revision: https://reviews.llvm.org/D43346 llvm-svn: 325290
* [DAGCombiner] Call ExtendUsesToFormExtLoad in (zext (and (load)))->(and ↵Craig Topper2018-02-151-13/+14
| | | | | | | | | | | | | | (zextload)) even when the and does not have multiple uses Same for the sign extend case. Currently we check for multiple uses on the binop. Then we call ExtendUsesToFormExtLoad to capture SetCCs that use the load. So we only end up finding any setccs when the and has additional uses and the load is used by a setcc. I don't think the and having multiple uses is relevant here. I think we should only be checking for the load having multiple uses. This changes an NVPTX test because we now find that the load has a second use by a truncate, but ExtendUsesToFormExtLoad only looks at setccs it can extend. All other operations just check isTruncateFree. Maybe we should allow widening of an existing truncate even if its not free? Differential Revision: https://reviews.llvm.org/D43063 llvm-svn: 325289
* [X86] Use btc/btr/bts to implement xor/and/or that affects a single bit in ↵Craig Topper2018-02-153-1/+37
| | | | | | | | | | | | the upper 32-bits of a 64-bit operation. We can't fold a large immediate into a 64-bit operation. But if we know we're only operating on a single bit we can use the bit instructions. For now only do this for optsize. Differential Revision: https://reviews.llvm.org/D37418 llvm-svn: 325287
* [Coroutines] Don't move stores for allocator argsBrian Gesiak2018-02-151-1/+16
| | | | | | | | | | | | | | | | | | | | | Summary: The behavior described in Coroutines TS `[dcl.fct.def.coroutine]/7` allows coroutine parameters to be passed into allocator functions. The instructions to store values into the alloca'd parameters must not be moved past the frame allocation, otherwise uninitialized values are passed to the allocator. Test Plan: `check-llvm` Reviewers: rsmith, GorNishanov, eric_niebler Reviewed By: GorNishanov Subscribers: compnerd, EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D43000 llvm-svn: 325285
* [Utils] salvageDI: Add a comment and move a call earlier, NFCVedant Kumar2018-02-151-1/+3
| | | | llvm-svn: 325280
* Silence warning about unused private variable.Zachary Turner2018-02-151-1/+3
| | | | llvm-svn: 325275
* Call FlushFileBuffers on output files.Zachary Turner2018-02-151-1/+15
| | | | | | | | | | | | There is a latent Windows kernel bug, the exact trigger conditions are not well understood, which can cause a file to be correctly written, but unable to be correctly read. The workaround appears to be simply calling FlushFileBuffers. Differential Revision: https://reviews.llvm.org/D42925 llvm-svn: 325274
* Recommit [Hexagon] Make the vararg handling a bit more robustKrzysztof Parzyszek2018-02-151-19/+11
| | | | | | | Use the FunctionType of the callee when it's available. It may not be available for synthetic calls to functions specified by external symbols. llvm-svn: 325269
* bpf: fix a bug in dag2dag optimization for loads from readonly sectionYonghong Song2018-02-151-4/+4
| | | | | | | | | | | | | | | | | | | | | The reference '&' is missing in the function parameter. If there are back-to-back optimizations in terms of dag node list like below: t29: i64,ch = load<LD4[bitcast (%struct.test_t* @test.t to i8*)+12](dereferenceable), zext from i32> t3, t43, undef:i64 t34: i64,ch = load<LD4[bitcast (%struct.test_t* @test.t to i8*)](dereferenceable), zext from i32> t3, t41, undef:i64 The bug will trigger a segfault for the added test case remove_truncate_5.ll: LLVMSymbolizer: error reading file: No such file or directory #0 0x000000000241c4d9 (llc+0x241c4d9) #1 0x000000000241c56a (llc+0x241c56a) #2 0x000000000241aa50 (llc+0x241aa50) ... #22 0x0000000000fd5edf (llc+0xfd5edf) #23 0x00007f0fe03bec05 __libc_start_main (/lib64/libc.so.6+0x21c05) #24 0x0000000000fd3e69 (llc+0xfd3e69) ... Segmentation fault Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 325267
* Revert "[Hexagon] Make the vararg handling a bit more robust"Krzysztof Parzyszek2018-02-151-8/+19
| | | | | | This is breaking lit tests. llvm-svn: 325266
* [InstCombine] use m_OneUse to reduce code; NFCSanjay Patel2018-02-151-2/+2
| | | | llvm-svn: 325263
* [Hexagon] Make the vararg handling a bit more robustKrzysztof Parzyszek2018-02-151-19/+8
| | | | | | | The FunctionType of the callee is always available, even if the Function of the callee is not. Use that to get the number of fixed parameters. llvm-svn: 325259
* [CodeGen] Separate MBB metadata from instructions in -debug printingFrancis Visoiu Mistrih2018-02-151-1/+9
| | | | | | | Add an empty line after 'liveins:', 'successors:', or '; predecessors:', the one that ends up to be the last one. llvm-svn: 325258
OpenPOWER on IntegriCloud