path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* [GISel][CallLowering] Enable vector support in argument loweringQuentin Colombet2019-10-111-4/+2
| | | | | | | | | | The existing code is actually already enough to handle the splitting of vector arguments, but we were lacking a test case. This commit adds a test case for vector argument lowering involving splitting and enables the related support in call lowering. llvm-svn: 374589
* [MachineIRBuilder] Fix an assertion failure with buildMergeQuentin Colombet2019-10-111-2/+5
| | | | | | | | | | | | | | | | Teach buildMerge how to deal with scalar-to-vector requests. Prior to this patch, buildMerge would issue either a G_MERGE_VALUES when all the vregs are scalars or a G_CONCAT_VECTORS when the destination vreg is a vector. G_CONCAT_VECTORS was actually not the proper instruction when the source vregs were scalars, and the compiler would assert that the sources must be vectors. Instead, we want to issue a G_BUILD_VECTOR when we are in this situation. This patch fixes that. llvm-svn: 374588
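The opcode choice described in the message above can be sketched roughly as follows. This is only an illustration of the decision, not the MachineIRBuilder code: the helper name `pick_merge_opcode` and its boolean parameters are hypothetical, and the case of a scalar destination with vector sources is not covered by the commit message.

```python
def pick_merge_opcode(dst_is_vector: bool, srcs_are_vectors: bool) -> str:
    """Hypothetical helper mirroring the opcode choice described in the commit message."""
    if not dst_is_vector:
        # Scalar destination built from scalar pieces.
        return "G_MERGE_VALUES"
    if srcs_are_vectors:
        # Vector destination built from smaller vectors.
        return "G_CONCAT_VECTORS"
    # Vector destination built from scalars: the case this patch fixes.
    return "G_BUILD_VECTOR"


print(pick_merge_opcode(dst_is_vector=True, srcs_are_vectors=False))
```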
* llvm-dwarfdump: Add verbose printing for debug_loclistsDavid Blaikie2019-10-113-39/+111
| | | | llvm-svn: 374582
* [X86][SSE] Add support for v4i8 add reductionSimon Pilgrim2019-10-111-2/+7
| | | | llvm-svn: 374579
* [AArch64][SVE] Implement sdot and udot (lane) intrinsicsKerry McLaughlin2019-10-113-22/+35
| | | | | | | | | | | | | | | | | | | | | Summary: Implements the following arithmetic intrinsics: - int_aarch64_sve_sdot - int_aarch64_sve_sdot_lane - int_aarch64_sve_udot - int_aarch64_sve_udot_lane This patch includes tests for the Subdivide4Argument type added by D67549 Reviewers: sdesmalen, SjoerdMeijer, greened, rengolin, rovka Reviewed By: sdesmalen Subscribers: tschuett, kristof.beyls, rkruppe, psnobl, cfe-commits, llvm-commits Differential Revision: https://reviews.llvm.org/D67551 llvm-svn: 374566
* [VPlan] Add moveAfter to VPRecipeBase.Florian Hahn2019-10-112-0/+10
| | | | | | | | | | | | | This patch adds a moveAfter method to VPRecipeBase, which can be used to move elements after other elements, across VPBasicBlocks, if necessary. Reviewers: dcaballe, hsaito, rengolin, hfinkel Reviewed By: dcaballe Differential Revision: https://reviews.llvm.org/D46825 llvm-svn: 374565
* [AIX] Use .space instead of .zero in assemblyDavid Tenty2019-10-111-0/+1
| | | | | | | | | | | | | | Summary: The AIX system assembler does not understand .zero, so we should prefer emitting .space. Subscribers: nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68815 llvm-svn: 374564
* [AMDGPU][MC][GFX9][GFX10] Corrected number of src operands for ds_[read/write]_addtid_b32Dmitry Preobrazhensky2019-10-111-5/+18
| | | | | | | | | | See https://bugs.llvm.org/show_bug.cgi?id=37941 Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D68787 llvm-svn: 374561
* [AMDGPU][MC][GFX6][GFX7][GFX10] Added instructions buffer_atomic_[fcmpswap/fmin/fmax]*Dmitry Preobrazhensky2019-10-111-14/+29
| | | | | | | | | | See https://bugs.llvm.org/show_bug.cgi?id=28232 Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D68788 llvm-svn: 374559
* [AMDGPU][MC][GFX10] Enabled null for 64-bit dst operandsDmitry Preobrazhensky2019-10-111-0/+12
| | | | | | | | | | See https://bugs.llvm.org/show_bug.cgi?id=43524 Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D68785 llvm-svn: 374557
* [DAGCombiner] fold vselect-of-constants to shiftSanjay Patel2019-10-111-0/+9
| | | | | | | | | | The diffs suggest that we are missing some more basic analysis/transforms, but this keeps the vector path in sync with the scalar (rL374397). This is again a preliminary step for introducing the reverse transform in IR as proposed in D63382. llvm-svn: 374555
* Fix compilation warnings. NFC.Michael Liao2019-10-112-2/+2
| | | | llvm-svn: 374554
* [AMDGPU][MC] Corrected parsing of optional operandsDmitry Preobrazhensky2019-10-111-12/+6
| | | | | | | | | | See https://bugs.llvm.org/show_bug.cgi?id=43486 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D68350 llvm-svn: 374553
* [mips] Fix loading "double" immediate into a GPR and FPRSimon Atanasyan2019-10-111-6/+14
| | | | | | | | | | | | | If the low 32 bits of a "double" (64-bit) value are zero, the value can be loaded into a GP/FP register as an instruction immediate. Currently, however, the assembler loads only the high 32 bits of the value. For example, if the target register is a GPR, the `li.d $4, 1.0` instruction is converted into `lui $4, 16368`. As a result, we get `0x3FF00000` in the register, while the correct representation of `1.0` is `0x3FF0000000000000`. The patch fixes that. Differential Revision: https://reviews.llvm.org/D68776 llvm-svn: 374544
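The numbers quoted in the message above can be checked with a few lines of Python. This is only an arithmetic illustration of the bit patterns, not assembler code; the helper name `double_bits` is made up for the example.

```python
import struct


def double_bits(value: float) -> int:
    """Return the IEEE-754 binary64 encoding of value as a 64-bit integer."""
    return struct.unpack(">Q", struct.pack(">d", value))[0]


bits = double_bits(1.0)
high = bits >> 32          # upper 32 bits of the encoding
low = bits & 0xFFFFFFFF    # lower 32 bits, zero for 1.0
lui_imm = high >> 16       # 16-bit immediate that `lui` shifts left by 16

print(hex(bits), hex(high), hex(low), lui_imm)
```

The immediate 16368 is 0x3FF0, so `lui` alone yields `0x3FF00000`, which is only the upper half of the 64-bit pattern.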
* Dead Virtual Function EliminationOliver Stannard2019-10-116-36/+215
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, it is hard for the compiler to remove unused C++ virtual functions, because they are all referenced from vtables, which are referenced by constructors. This means that if the constructor is called from any live code, then we keep every virtual function in the final link, even if there are no call sites which can use it. This patch allows unused virtual functions to be removed during LTO (and regular compilation in limited circumstances) by using type metadata to match virtual function call sites to the vtable slots they might load from. This information can then be used in the global dead code elimination pass instead of the references from vtables to virtual functions, to more accurately determine which functions are reachable. To make this transformation safe, I have changed clang's code-generation to always load virtual function pointers using the llvm.type.checked.load intrinsic, instead of regular load instructions. I originally tried writing this using clang's existing code-generation, which uses the llvm.type.test and llvm.assume intrinsics after doing a normal load. However, it is possible for optimisations to obscure the relationship between the GEP, load and llvm.type.test, causing GlobalDCE to fail to find virtual function call sites. The existing linkage and visibility types don't accurately describe the scope in which a virtual call could be made which uses a given vtable. This is wider than the visibility of the type itself, because a virtual function call could be made using a more-visible base class. I've added a new !vcall_visibility metadata type to represent this, described in TypeMetadata.rst. The internalization pass and libLTO have been updated to change this metadata when linking is performed. 
This doesn't currently work with ThinLTO, because it needs to see every call to llvm.type.checked.load in the linkage unit. It might be possible to extend this optimisation to be able to use the ThinLTO summary, as was done for devirtualization, but until then that combination is rejected in the clang driver. To test this, I've written a fuzzer which generates random C++ programs with complex class inheritance graphs, and virtual functions called through object and function pointers of different types. The programs are spread across multiple translation units and DSOs to test the different visibility restrictions. I've also tried doing bootstrap builds of LLVM to test this. This isn't ideal, because only classes in anonymous namespaces can be optimised with -fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not work correctly with -fvisibility=hidden. However, there are only 12 test failures when building with -fvisibility=hidden (and an unmodified compiler), and this change does not cause any new failures for either value of -fvisibility. On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size reduction of ~6%, over a baseline compiled with "-O2 -flto -fvisibility=hidden -fwhole-program-vtables". The best cases are reductions of ~14% in 450.soplex and 483.xalancbmk, and there are no code size increases. I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which show a geomean size reduction of ~3%, again with no size increases. I had hoped that this would have no effect on performance, which would allow it to always be enabled (when using -fwhole-program-vtables). However, the changes in clang to use the llvm.type.checked.load intrinsic are causing a ~1% performance regression in the C++ parts of SPEC2006.
It should be possible to recover some of this perf loss by teaching optimisations about the llvm.type.checked.load intrinsic, which would make it worth turning this on by default (though it's still dependent on -fwhole-program-vtables). Differential revision: https://reviews.llvm.org/D63932 llvm-svn: 374539
* [FileCheck] Implement --ignore-case option.Kai Nacke2019-10-112-2/+10
| | | | | | | | | | The FileCheck utility is enhanced to support a `--ignore-case` option. This is useful in cases where the output of Unix tools differs in case (e.g. case not specified by POSIX). Reviewers: Bigcheese, jakehehrlich, rupprecht, espindola, alexshap, jhenderson, MaskRay Differential Revision: https://reviews.llvm.org/D68146 llvm-svn: 374538
* [SCEV] Add stricter verification option.Florian Hahn2019-10-111-5/+8
| | | | | | | | | | | | | | | | | | | | Currently -verify-scev only fails if there is a constant difference between two BE counts. This misses a lot of cases. This patch adds a -verify-scev-strict option, which fails for any non-zero difference, if used together with -verify-scev. With the stricter checking, some unit tests fail because of mismatches, especially around IndVarSimplify. If there is no reason I am missing for just checking constant deltas, I am planning on looking into the various failures. Reviewers: efriedma, sanjoy.google, reames, atrick Reviewed By: sanjoy.google Differential Revision: https://reviews.llvm.org/D68592 llvm-svn: 374535
* [X86] isFNEG - add recursion depth limitSimon Pilgrim2019-10-111-5/+9
| | | Now that it's used by isNegatibleForFree, we should try to avoid costly deep recursion. llvm-svn: 374534
* Insert module constructors in a module passVitaly Buka2019-10-113-45/+59
| | | | | | | | | | | | | | | | | | | | | Summary: If we insert them from a function pass, some analyses may be missing or invalid. Fixes PR42877. Reviewers: eugenis, leonardchan Reviewed By: leonardchan Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68832 > llvm-svn: 374481 Signed-off-by: Vitaly Buka <vitalybuka@google.com> llvm-svn: 374527
* [PowerPC] Remove assertion "Shouldn't overwrite a register before it is killed"Yi-Hong Lyu2019-10-111-8/+9
| | | | | | | | | | | | The assertion is overzealous and fails on tests like: `renamable $x3 = LI8 0`, `STD renamable $x3, 16, $x1`, `renamable $x3 = LI8 0`. Remove the assertion since the killed flag of $x3 is not mandatory. Differential Revision: https://reviews.llvm.org/D68344 llvm-svn: 374515
* [InstCombine] recognize popcount.Chen Zheng2019-10-111-0/+67
| | | | | | | This patch recognizes the popcount intrinsic from the algorithm described at http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel Differential Revision: https://reviews.llvm.org/D68189 llvm-svn: 374512
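For reference, the parallel bit-counting idiom from the linked page looks roughly like this for 32-bit values. The sketch below is a plain Python transcription of that well-known algorithm with explicit 32-bit masking; it is not the InstCombine matcher, and the function name `popcount32` is chosen for the example.

```python
def popcount32(v: int) -> int:
    """Count set bits in a 32-bit value using the parallel (SWAR) method."""
    v &= 0xFFFFFFFF
    v = v - ((v >> 1) & 0x55555555)                  # 2-bit partial sums
    v = (v & 0x33333333) + ((v >> 2) & 0x33333333)   # 4-bit partial sums
    v = (v + (v >> 4)) & 0x0F0F0F0F                  # 8-bit partial sums
    # Multiply to add the four byte sums into the top byte, then extract it.
    return ((v * 0x01010101) & 0xFFFFFFFF) >> 24


print(popcount32(0xF0F0F0F0))
```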
* [X86] Add a DAG combine to turn v16i16->v16i8 VTRUNCUS+store into a saturating truncating store.Craig Topper2019-10-111-0/+13
| | | | llvm-svn: 374509
* [CVP] Remove a masking operation if range information implies it's a noopPhilip Reames2019-10-111-0/+27
| | | | | | | | | | This is really a known bits style transformation, but known bits isn't context sensitive. The particular case which comes up happens to involve a range which allows range based reasoning to eliminate the mask pattern, so handle that case specifically in CVP. InstCombine likes to generate the mask-by-low-bits pattern when widening an arithmetic expression which includes a zext in the middle. Differential Revision: https://reviews.llvm.org/D68811 llvm-svn: 374506
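The reasoning behind the transform can be illustrated with a small check: if a value is known to lie in an unsigned range that fits in the low k bits, masking with the low k bits changes nothing. The sketch below assumes a half-open range `[lo, hi)` and a hypothetical helper name; it is not how CVP represents ranges internally.

```python
def mask_is_noop(lo: int, hi: int, k: int) -> bool:
    """True if x & ((1 << k) - 1) == x for every x in the unsigned range [lo, hi)."""
    return 0 <= lo < hi <= (1 << k)


# Brute-force confirmation for a small range, e.g. x in [0, 200) masked with 0xFF.
k = 8
mask = (1 << k) - 1
print(mask_is_noop(0, 200, k), all((x & mask) == x for x in range(0, 200)))
```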
* Revert 374481 "[tsan,msan] Insert module constructors in a module pass"Nico Weber2019-10-113-59/+45
| | | | | | | CodeGen/sanitizer-module-constructor.c fails on mac and windows, see e.g. http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/11424 llvm-svn: 374503
* [JITLink] Fix MachO/arm64 GOTPAGEOFF encoding.Lang Hames2019-10-111-2/+5
| | | | | | | | The original implementation failed to shift the immediate down. This should fix some of the bot failures due to r374476. llvm-svn: 374499
* [Attributor][FIX] Do not replace musttail calls with constantJohannes Doerfert2019-10-111-1/+1
| | | | llvm-svn: 374498
* AMDGPU: Move SelectFlatOffset back into AMDGPUISelDAGToDAGMatt Arsenault2019-10-113-62/+43
| | | | llvm-svn: 374495
* [Stats] Add ALWAYS_ENABLED_STATISTIC enabled regardless of LLVM_ENABLE_STATS.Volodymyr Sapsai2019-10-111-14/+13
| | | | | | | | | | | | | | | | | | | | | The intended usage is to measure relatively expensive operations. So the cost of the statistic is negligible compared to the cost of a measured operation and can be enabled all the time without impairing the compilation time. rdar://problem/55715134 Reviewers: dsanders, bogner, rtereshin Reviewed By: dsanders Subscribers: hiraditya, jkorous, dexonsmith, ributzka, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68252 llvm-svn: 374490
* [X86] Improve the AVX512 bailout in combineTruncateWithSat to allow pack instructions in more situations.Craig Topper2019-10-111-2/+9
| | | | | | If we don't have VLX we won't end up selecting a saturating truncate for 256-bit or smaller vectors so we should just use the pack lowering. llvm-svn: 374487
* [tsan,msan] Insert module constructors in a module passVitaly Buka2019-10-103-45/+59
| | | | | | | | | | | | | | | | Summary: If we insert them from a function pass, some analyses may be missing or invalid. Fixes PR42877. Reviewers: eugenis, leonardchan Reviewed By: leonardchan Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68832 llvm-svn: 374481
* [msan, NFC] Move option parsing into constructorVitaly Buka2019-10-101-10/+12
| | | | llvm-svn: 374480
* Fix compilation warning due to typo.Michael Liao2019-10-101-1/+1
| | | | llvm-svn: 374479
* [JITLink] Add an initial implementation of JITLink for MachO/AArch64.Lang Hames2019-10-103-0/+737
| | | | | | | | | This implementation has support for all relocation types except TLV. Compact unwind sections are not yet supported, so exceptions/unwinding will not work. llvm-svn: 374476
* [MemorySSA] Update Phi simplification.Alina Sbirlea2019-10-101-5/+12
| | | | | | | | | | When simplifying a Phi to the unique value found incoming, check that there wasn't a Phi already created to break a cycle. If so, remove it. Resolves PR43541. Some additional nits included. llvm-svn: 374471
* [GISel] Simplifying return from else in function. NFCMarcello Maggioni2019-10-101-2/+1
| | | | | | Forgot to integrate this little change in previous commit llvm-svn: 374463
* [X86] Guard against leaving a dangling node in combineTruncateWithSat.Craig Topper2019-10-101-4/+13
| | | | | | | | | | | | | | | | When handling the packus pattern for i32->i8 we do a two step process using a packss to i16 followed by a packus to i8. If the final i8 step is a type with fewer than 64 bits, the packus step will return SDValue(), but the i32->i16 step might have succeeded. This leaves the nodes from the middle step dangling. Guard against this by pre-checking that the number of elements is at least 8 before doing the middle step. With that check in place, the only other case in which the middle step itself can fail is when SSE2 is disabled. So add an early SSE2 check, then just assert that neither the middle nor the final step ever fails. llvm-svn: 374460
* [GISel] Allow getConstantVRegVal() to return G_FCONSTANT values.Marcello Maggioni2019-10-101-11/+31
| | | | | | | | | | | | | | | | | | | | | In GISel we have both G_CONSTANT and G_FCONSTANT, but because in GISel we don't really have a concept of Float vs Int value, the only difference between the two is where the data originates from. What both G_CONSTANT and G_FCONSTANT return is just a bag of bits with the constant representation in it. By making getConstantVRegVal() return G_FCONSTANT's bit representation as well, we allow ConstantFold and other things to operate with G_FCONSTANT. Adding tests that show ConstantFolding to work on mixed G_CONSTANT and G_FCONSTANT sources. Differential Revision: https://reviews.llvm.org/D68739 llvm-svn: 374458
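The "bag of bits" view from the message above can be illustrated as follows: a floating-point constant and an integer constant with the same encoding are indistinguishable once only the bits are kept. This is just a Python illustration of IEEE-754 binary32 encodings, with a helper name made up for the example; it does not model getConstantVRegVal() itself.

```python
import struct


def float32_bits(value: float) -> int:
    """Return the IEEE-754 binary32 encoding of value as a 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


# A float constant 1.0 carries the same 32 bits as the integer 0x3F800000.
print(hex(float32_bits(1.0)), float32_bits(1.0) == 0x3F800000)
```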
* [AMDGPU] Handle undef old operand in DPP combineStanislav Mekhanoshin2019-10-101-1/+3
| | | | | | | | It was missing an undef flag. Differential Revision: https://reviews.llvm.org/D68813 llvm-svn: 374455
* [ValueTracking] Improve pointer offset computation for cases of same baseRong Xu2019-10-101-9/+39
| | | | | | | | | | | | This patch improves the handling of pointer offset in GEP expressions where one argument is the base pointer. isPointerOffset() is being used by memcpyopt where current code synthesizes consecutive 32 bytes stores to one store and two memset intrinsic calls. With this patch, we convert the stores to one memset intrinsic. Differential Revision: https://reviews.llvm.org/D67989 llvm-svn: 374454
* [InstCombine] Add test case for PR43617 (NFC)Evandro Menezes2019-10-101-3/+1
| | | | | | Also, refactor check in `LibCallSimplifier::optimizeLog()`. llvm-svn: 374453
* [MemorySSA] Additional handling of unreachable blocks.Alina Sbirlea2019-10-101-0/+4
| | | | | | | | | | | | | | | | | | | | | Summary: Whenever we get the previous definition, the assumption is that the recursion starts in a reachable block. If the recursion starts in an unreachable block, we may recurse indefinitely. Handle this case by returning LoE if the block is unreachable. Resolves PR43426. Reviewers: george.burgess.iv Subscribers: Prazek, sanjoy.google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68809 llvm-svn: 374447
* [System Model] [TTI] Move default cache/prefetch implementationsDavid Greene2019-10-101-28/+0
| | | | | | | | Move the default implementations of cache and prefetch queries to TargetTransformInfoImplBase and delete them from NoTTIImpl. This brings these interfaces in line with how other TTI interfaces work. Differential Revision: https://reviews.llvm.org/D68804 llvm-svn: 374446
* [PDB] Fix bug when using multiple PCH header objects with the same name.Zachary Turner2019-10-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | A common pattern in Windows is to have all your precompiled headers use an object named stdafx.obj. If you've got a project with many different static libs, you might use a separate PCH for each one of these. During the final link step, a file from A might reference the PCH object from A, but it will have the same name (stdafx.obj) as any other PCH from another project. The only difference will be the path. For example, A might be A/stdafx.obj while B is B/stdafx.obj. The existing algorithm checks only the filename that was passed on the command line (or stored in archive), but this is insufficient in the case where relative paths are used, because depending on the command line object file / library order, it might find the wrong PCH object first resulting in a signature mismatch. The fix here is to simply check whether the absolute path of the PCH object (which is stored in the input obj file for the file that references the PCH) *ends with* the full relative path of whatever is specified on the command line (or is in the archive). Differential Revision: https://reviews.llvm.org/D66431 llvm-svn: 374442
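The "ends with" check described above can be sketched like this. The helper name, the component-wise comparison, and the case-insensitive matching are assumptions made for illustration; the actual linker code may compare the paths differently.

```python
from pathlib import PureWindowsPath


def pch_path_matches(absolute_pch: str, command_line_path: str) -> bool:
    """Hypothetical check: does the absolute PCH path end with the given relative path?"""
    abs_parts = [p.lower() for p in PureWindowsPath(absolute_pch).parts]
    rel_parts = [p.lower() for p in PureWindowsPath(command_line_path).parts]
    if not rel_parts or len(rel_parts) > len(abs_parts):
        return False
    # Compare whole path components so that "A/stdafx.obj" does not match "BA/stdafx.obj".
    return abs_parts[-len(rel_parts):] == rel_parts


print(pch_path_matches(r"C:\proj\A\stdafx.obj", "A/stdafx.obj"))  # same project
print(pch_path_matches(r"C:\proj\A\stdafx.obj", "B/stdafx.obj"))  # different project
```

Matching on the bare file name alone (`stdafx.obj`) would accept both projects, which is the ambiguity the commit describes.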
* ADT: Save a word in every StringSet entryJordan Rose2019-10-101-1/+1
| | | | | | | | | | | | | | | Add a specialization to StringMap (actually StringMapEntry) for a value type of NoneType (the type of llvm::None), and use it for StringSet. This'll save us a word from every entry in a StringSet, used for alignment with the size_t that stores the string length. I could have gone all the way to some kind of empty base class optimization, but that seemed like overkill. Someone can consider adding that in the future, though. https://reviews.llvm.org/D68586 llvm-svn: 374440
* [X86] Use packusdw+vpmovuswb to implement v16i32->v16i8 that clamps signed inputs to be between 0 and 255 when zmm registers are disabled on SKX.Craig Topper2019-10-101-0/+15
| | | | | | | | If we've disabled zmm registers, the v16i32 will need to be split. This split will propagate through the min/max and the truncate. This creates two sequences that need to be concatenated back to v16i8. We can instead use packusdw to do part of the clamping, truncating, and concatenating all at once. Then we can use a vpmovuswb to finish off the clamp. Differential Revision: https://reviews.llvm.org/D68763 llvm-svn: 374431
* win: Move Parallel.h off concrt to cross-platform codeNico Weber2019-10-101-30/+1
| | | | | | | | | | | | | | | | | | | | | r179397 added Parallel.h and implemented it in terms of concrt in 2013. In 2015, a cross-platform implementation of the functions appeared and is in use everywhere but on Windows (r232419). r246219 hints that <thread> had issues in MSVC2013, but r296906 suggests they've been fixed now that we require 2015+. So remove the concrt code. It's less code, and it sounds like concrt has conceptual and performance issues, see PR41198. I built blink_core.dll in a debug component build with full symbols and in a release component build without any symbols. I couldn't measure a performance difference for linking blink_core.dll before and after this patch. Differential Revision: https://reviews.llvm.org/D68820 llvm-svn: 374421
* [NFC][PowerPC]Clean up PPCAsmPrinter for TOC related pseudo opcodeXiangling Liao2019-10-101-93/+70
| | | | | | | | | Add a helper function getMCSymbolForTOCPseudoMO to clean up PPCAsmPrinter a little bit. Differential Revision: https://reviews.llvm.org/D68721 llvm-svn: 374420
* Print quoted backslashes in LLVM IR as \\ instead of \5CReid Kleckner2019-10-101-1/+3
| | | | | | | | | This improves readability of Windows path string literals in LLVM IR. The LLVM assembler has supported \\ in IR strings for a long time, but the lexer doesn't tolerate escaped quotes, so they have to be printed as \22 for now. llvm-svn: 374415
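The escaping rule described above can be sketched as follows: printable characters are emitted as-is, a backslash becomes `\\`, and everything else, including the double quote, is emitted as a backslash followed by two hex digits. This is a simplified Python model written for illustration, not the LLVM printing routine, and the function name is made up.

```python
def print_escaped(data: bytes) -> str:
    """Simplified model of escaping a string for an LLVM IR string literal."""
    out = []
    for byte in data:
        ch = chr(byte)
        if ch == "\\":
            out.append("\\\\")            # backslash printed as \\ (previously \5C)
        elif 0x20 <= byte < 0x7F and ch != '"':
            out.append(ch)                # printable ASCII is kept unchanged
        else:
            out.append("\\%02X" % byte)   # e.g. a double quote becomes \22
    return "".join(out)


print(print_escaped(b'C:\\src\\"a b".c'))
```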
* Fix Windows build after r374381Nico Weber2019-10-102-7/+2
| | | | llvm-svn: 374413
* Remove strings.h include to fix GSYM Windows buildReid Kleckner2019-10-101-1/+0
| | | | | | Fifth time's the charm. llvm-svn: 374411