summaryrefslogtreecommitdiffstats
path: root/llvm/test
Commit message (Collapse)AuthorAgeFilesLines
...
* [NVPTX] Implemented wmma intrinsics and instructions.Artem Belevich2017-10-121-0/+201
| | | | | | | | | | WMMA = "Warp Level Matrix Multiply-Accumulate". These are the new instructions introduced in PTX6.0 and available on sm_70 GPUs. Differential Revision: https://reviews.llvm.org/D38645 llvm-svn: 315601
* [codeview] Don't emit FPO data in funclet prologuesReid Kleckner2017-10-121-0/+85
| | | | | | Attempt 3 to work around bugs in FPO data with funclets. llvm-svn: 315600
* llvm-isel-fuzzer: Work around BUILD_SHARED_LIBS testing issuesJustin Bogner2017-10-123-0/+10
| | | | | | | | | | | | | | | Building with BUILD_SHARED_LIBS makes it tricky to copy around executables at will, since they won't be able to find the LLVM libraries any more. This makes testing a feature that's based on the executable name problematic, so we'll just disable these two tests in that configuration. We could potentially fix this by symlinking the lib directory into the test directory, but that wouldn't work on windows, and losing testing on windows would be far worse than losing testing on a configuration that's barely even supported. llvm-svn: 315599
* [TableGen] Allow intrinsics to have up to 8 return values.Artem Belevich2017-10-121-0/+32
| | | | | | Differential Revision: https://reviews.llvm.org/D38633 llvm-svn: 315598
* [ValueTracking] return zero when there's conflict in known bits of a shift ↵Sanjay Patel2017-10-122-28/+7
| | | | | | | | (PR34838) Poison allows us to return a better result than undef. llvm-svn: 315595
* Reintroduce "[SCCP] Propagate integer range info for parameters in IPSCCP."Bruno Cardoso Lopes2017-10-121-0/+117
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is r315288 & r315294, which were reverted due to stage2 bot failures. Summary: This updates the SCCP solver to use of the ValueElement lattice for parameters, which provides integer range information. The range information is used to remove unneeded icmp instructions. For the following function, f() can be optimized to `ret i32 2` with this change source_filename = "sccp.c" target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: norecurse nounwind readnone uwtable define i32 @main() local_unnamed_addr #0 { entry: %call = tail call fastcc i32 @f(i32 1) %call1 = tail call fastcc i32 @f(i32 47) %add3 = add nsw i32 %call, %call1 ret i32 %add3 } ; Function Attrs: noinline norecurse nounwind readnone uwtable define internal fastcc i32 @f(i32 %x) unnamed_addr #1 { entry: %c1 = icmp sle i32 %x, 100 %cmp = icmp sgt i32 %x, 300 %. = select i1 %cmp, i32 1, i32 2 ret i32 %. } attributes #1 = { noinline } Reviewers: davide, sanjoy, efriedma, dberlin Reviewed By: davide, dberlin Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D36656 llvm-svn: 315593
* [PowerPC] Add profitablilty check for conversion to mtctr loopsLei Huang2017-10-122-5/+158
| | | | | | | | | | | | | | | Add profitability checks for modifying counted loops to use the mtctr instruction. The latency of mtctr is only justified if there are more than 4 comparisons that will be removed as a result. Usually counted loops are formed relatively early and before unrolling, so most low trip count loops often don't survive. However we want to ensure that if they do, we do not mistakenly update them to mtctr loops. Use CodeMetrics to ensure we are only doing this for small loops with small trip counts. Differential Revision: https://reviews.llvm.org/D38212 llvm-svn: 315592
* [AMDGPU] For amdpal, widen interpolation mode workaroundTim Renouf2017-10-121-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: The interpolation mode workaround ensures that at least one interpolation mode is enabled in PSInputAddr. It does not also check PSInputEna on the basis that the user might enable bits in that depending on run-time state. However, for amdpal os type, the user does not enable some bits after compilation based on run-time states; the register values being generated here are the final ones set in the hardware. Therefore, apply the workaround to PSInputAddr and PSInputEnable together. (The case where a bit is set in PSInputAddr but not in PSInputEnable is where the frontend set up an input arg for a particular interpolation mode, but nothing uses that input arg. Really we should have an earlier pass that removes such an arg.) Reviewers: arsenm, nhaehnle, dstuttard Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D37758 llvm-svn: 315591
* [RegisterCoalescer] Don't set read-undef in pruneValues, only clearMikael Holmen2017-10-121-0/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The comments in the code said // Remove <def,read-undef> flags. This def is now a partial redef. but the code didn't just remove read-undef, it could introduce new ones which could cause errors. E.g. if we have something like %vreg1<def> = IMPLICIT_DEF %vreg2:subreg1<def, read-undef> = op %vreg3, %vreg4 %vreg2:subreg2<def> = op %vreg6, %vreg7 and we merge %vreg1 and %vreg2 then we should not set undef on the second subreg def, which the old code did. Now we solve this by actually do what the code comment says. We remove read-undef flags rather than remove or introduce them. Reviewers: qcolombet, MatzeB Reviewed By: MatzeB Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38616 llvm-svn: 315564
* Re-commit "llvm-isel-fuzzer: Handle a subset of backend flags in the exec name"Justin Bogner2017-10-122-0/+30
| | | | | | | | | | | | | | | | | | Here we add a secondary option parser to llvm-isel-fuzzer (and provide it for use with other fuzzers). With this, you can copy the fuzzer to a name like llvm-isel-fuzzer=aarch64-gisel for a fuzzer that fuzzer AArch64 with GlobalISel enabled, or fuzzer=x86_64 to fuzz x86, with no flags required. This should be useful for running these in OSS-Fuzz. Note that this handrolls a subset of cl::opts to recognize, rather than embedding a complete command parser for argv[0]. If we find we really need the flexibility of handling arbitrary options at some point we can rethink this. This re-applies 315545 using "=" instead of ":" as a separator for arguments. llvm-svn: 315557
* Revert r315545 "llvm-isel-fuzzer: Handle a subset of backend flags in the ↵Hans Wennborg2017-10-122-30/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | executable name" It broke some tests on Windows: Failing Tests (4): LLVM :: tools/llvm-isel-fuzzer/execname-options.ll LLVM :: tools/llvm-isel-fuzzer/missing-triple.ll LLVM :: tools/llvm-isel-fuzzer/x86-empty-bc.ll LLVM :: tools/llvm-isel-fuzzer/x86-empty.ll > llvm-isel-fuzzer: Handle a subset of backend flags in the executable name > > Here we add a secondary option parser to llvm-isel-fuzzer (and provide > it for use with other fuzzers). With this, you can copy the fuzzer to > a name like llvm-isel-fuzzer:aarch64-gisel for a fuzzer that fuzzer > AArch64 with GlobalISel enabled, or fuzzer:x86_64 to fuzz x86, with no > flags required. This should be useful for running these in OSS-Fuzz. > > Note that this handrolls a subset of cl::opts to recognize, rather > than embedding a complete command parser for argv[0]. If we find we > really need the flexibility of handling arbitrary options at some > point we can rethink this. llvm-svn: 315554
* [SimplifyIndVar] Replace IVUsers with loop invariant whenever possibleHongbin Zheng2017-10-122-39/+88
| | | | | | Differential Revision: https://reviews.llvm.org/D38415 llvm-svn: 315551
* llvm-isel-fuzzer: Handle a subset of backend flags in the executable nameJustin Bogner2017-10-122-0/+30
| | | | | | | | | | | | | | | Here we add a secondary option parser to llvm-isel-fuzzer (and provide it for use with other fuzzers). With this, you can copy the fuzzer to a name like llvm-isel-fuzzer:aarch64-gisel for a fuzzer that fuzzer AArch64 with GlobalISel enabled, or fuzzer:x86_64 to fuzz x86, with no flags required. This should be useful for running these in OSS-Fuzz. Note that this handrolls a subset of cl::opts to recognize, rather than embedding a complete command parser for argv[0]. If we find we really need the flexibility of handling arbitrary options at some point we can rethink this. llvm-svn: 315545
* Revert r307036 because of PR34919.Wei Mi2017-10-127-743/+1589
| | | | llvm-svn: 315540
* AMDGPU/NFC: Minor clean ups in PAL metadataKonstantin Zhuravlyov2017-10-118-9/+9
| | | | | | | | | - Move PAL metadata definitions to AMDGPUMetadata - Make naming consistent with HSA metadata Differential Revision: https://reviews.llvm.org/D38745 llvm-svn: 315523
* AMDGPU/NFC: Rename code object metadata as HSA metadataKonstantin Zhuravlyov2017-10-1114-27/+27
| | | | | | | | | - Rename AMDGPUCodeObjectMetadata to AMDGPUMetadata (PAL metadata will be included in this file in the follow up change) - Rename AMDGPUCodeObjectMetadataStreamer to AMDGPUHSAMetadataStreamer - Introduce HSAMD namespace - Other minor name changes in function and test names llvm-svn: 315522
* Really fix llvm-rc include-paths.testReid Kleckner2017-10-111-3/+3
| | | | llvm-svn: 315515
* Attempt to fix failing llvm-rc include-paths.textReid Kleckner2017-10-111-1/+1
| | | | llvm-svn: 315514
* [codeview] Implement FPO data assembler directivesReid Kleckner2017-10-1111-5/+1753
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This adds a set of new directives that describe 32-bit x86 prologues. The directives are limited and do not expose the full complexity of codeview FPO data. They are merely a convenience for the compiler to generate more readable assembly so we don't need to generate tons of labels in CodeGen. If our prologue emission changes in the future, we can change the set of available directives to suit our needs. These are modelled after the .seh_ directives, which use a different format that interacts with exception handling. The directives are: .cv_fpo_proc _foo .cv_fpo_pushreg ebp/ebx/etc .cv_fpo_setframe ebp/esi/etc .cv_fpo_stackalloc 200 .cv_fpo_endprologue .cv_fpo_endproc .cv_fpo_data _foo I tried to follow the implementation of ARM EHABI CFI directives by sinking most directives out of MCStreamer and into X86TargetStreamer. This helps avoid polluting non-X86 code with WinCOFF specific logic. I used cdb to confirm that this can show locals in parent CSRs in a few cases, most importantly the one where we use ESI as a frame pointer, i.e. the one in http://crbug.com/756153#c28 Once we have cdb integration in debuginfo-tests, we can add integration tests there. Reviewers: majnemer, hans Subscribers: aemerson, mgorny, kristof.beyls, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D38776 llvm-svn: 315513
* [Hexagon] Make sure that new-value jump is packetized with producerKrzysztof Parzyszek2017-10-111-0/+31
| | | | llvm-svn: 315510
* [MachineCombiner] Fix initialisation of LastUpdate for incremental update.Florian Hahn2017-10-111-0/+48
| | | | | | | | | | | | | | | | | Summary: Fixes a bogus iterator resulting from the removal of a block's first instruction at the point that incremental update is enabled. Patch by Paul Walker. Reviewers: fhahn, Gerolf, efriedma, MatzeB Reviewed By: fhahn Subscribers: aemerson, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D38734 llvm-svn: 315502
* [PowerPC] Utilize DQ-Form instructions for spill/restore and fix FrameIndex ↵Lei Huang2017-10-114-238/+210
| | | | | | | | | | | | | | | | | | | | | | | | | | elimination to only use `lis/addi` if necessary. Currently we produce a bunch of unnecessary code when emitting the prologue/epilogue for spills/restores. Namely, if the load from stack slot/store to stack slot instruction is an X-Form instruction, we will always produce an LIS/ORI sequence for the stack offset. Furthermore, we have not exploited the P9 vector D-Form loads/stores for this purpose. This patch address both issues. Specifying the D-Form load as the instruction to use for stack spills/reloads should be safe because: 1. The stack should be aligned according to the ABI 2. If the stack isn't aligned, PPCRegisterInfo::eliminateFrameIndex() will check for the offset being a multiple of 16 and will convert it to an X-Form instruction if it isn't. Differential Revision : https://reviews.llvm.org/D38758 llvm-svn: 315500
* [llvm-rc] Use proper search algorithm for finding resources.Zachary Turner2017-10-115-6/+50
| | | | | | | | | | | | Previously we would only look in the current directory for a resource, which might not be the same as the directory of the rc file. Furthermore, MSVC rc supports a /I option, and can also look in the system environment. This patch adds support for this search algorithm. Differential Revision: https://reviews.llvm.org/D38740 llvm-svn: 315499
* [x86] avoid infinite loop from SoftenFloatOperand (PR34866)Sanjay Patel2017-10-111-0/+65
| | | | | | | | | Legalization of fp128 assumes things that we should have asserts for, so that's another potential improvement. Differential Revision: https://reviews.llvm.org/D38771 llvm-svn: 315485
* Reland "[llvm-objcopy] Add support for --strip-sections to remove all ↵Jake Ehrlich2017-10-111-0/+66
| | | | | | | | | | | | | | | | | | | | | section headers leaving only program headers and loadable segment data" ubsan caught an issue I made where I was converting a null pointer to a reference. elf utils implements a particularly extreme form of stripping that I'd like to support. eu-strip has an option called "strip-sections" that removes all section headers and leaves only program headers and the segment data. I have implemented this option partly as a test but mainly because in Fuchsia we would like to use this option to minimize the size of our executables. The other strip options that are on my list include --strip-all and --strip-debug. This is a preliminary implementation that I'd like to start using in Fuchsia builds if possible. This change implements such a stripping option for llvm-objcopy Differential Revision: https://reviews.llvm.org/D38335 llvm-svn: 315484
* [NFC] update test case so checks are not order dependent when not neededLei Huang2017-10-111-5/+5
| | | | llvm-svn: 315482
* Convert an ErrorOr to Expected.Rafael Espindola2017-10-111-1/+3
| | | | | | | getRelocationAddend should never be called on non SHT_RELA sections, but changing that requires changing RelocVisitor.h. llvm-svn: 315473
* [Pipeliner] Improve serialization order for post-incrementsKrzysztof Parzyszek2017-10-111-0/+37
| | | | | | | | | | | | | | | | | | | The pipeliner is generating a serial sequence that causes poor register allocation when a post-increment instruction appears prior to the use of the post-increment register. This occurs when there is a circular set of dependences involved with a sequence of instructions in the same cycle. In this case, there is no serialization of the parallel semantics that will not cause an additional register to be allocated. This patch fixes the problem by changing the instructions so that the post-increment instruction is used by the subsequent instruction, which enables the register allocator to make a better decision and not require another register. Patch by Brendon Cahoon. llvm-svn: 315466
* [InstCombine] add baseline tests for D38531; NFCSanjay Patel2017-10-111-1/+103
| | | | llvm-svn: 315461
* [DAGCombiner] convert insertelement of bitcasted vector into shuffleSanjay Patel2017-10-112-48/+15
| | | | | | | | | | | | | | | | Eg: insert v4i32 V, (v2i16 X), 2 --> shuffle v8i16 V', X', {0,1,2,3,8,9,6,7} This is a generalization of the IR fold in D38316 to handle insertion into a non-undef vector. We may want to abandon that one if we can't find value in squashing the more specific pattern sooner. We're using the existing legal shuffle target hook to avoid AVX512 horror with vXi1 shuffles. There may be room for improvement in the shuffle lowering here, but that would be follow-up work. Differential Revision: https://reviews.llvm.org/D38388 llvm-svn: 315460
* Revert "[dsymutil] Timestmap verification for __swift_ast"Jonas Devlieghere2017-10-113-34/+3
| | | | | | This reverts commit r315456. llvm-svn: 315458
* [dsymutil] Timestmap verification for __swift_astJonas Devlieghere2017-10-113-3/+34
| | | | | | | | | | | | This patch adds timestamp verification for swiftmodule files. - A new flag is provided to allows us to continue testing of the code for embedding the__swift_ast. (git doesn't maintain timestamps) - Adds a new test for fat (arm) binaries. Differential revision: https://reviews.llvm.org/D38686 llvm-svn: 315456
* [mips] Add missing tests from rL315451Simon Dardis2017-10-114-0/+704
| | | | llvm-svn: 315454
* [X86] Added tests for TESTM and TESTNM (NFC)Uriel Korach2017-10-114-0/+1148
| | | | | | | | Adding this test files now so after another commit that will add a new pattern for TESTM and TESTNM instructions will show the improvemnts that have been done. Change-Id: If3908b7f91897d764053312365a2bc1de78b291d llvm-svn: 315443
* [GVN] Prevent LoadPRE from hoisting across instructions that don't pass ↵Max Kazantsev2017-10-112-0/+174
| | | | | | | | | | | | | | | | | | control flow to successors This patch fixes the miscompile that happens when PRE hoists loads across guards and other instructions that don't always pass control flow to their successors. PRE is now prohibited to hoist across such instructions because there is no guarantee that the load standing after such instruction is still valid before such instruction. For example, a load from under a guard may be invalid before the guard in the following case: int array[LEN]; ... guard(0 <= index && index < LEN); use(array[index]); Differential Revision: https://reviews.llvm.org/D37460 llvm-svn: 315440
* [LICM] Disallow sinking of unordered atomic loads into loopsMax Kazantsev2017-10-111-1/+94
| | | | | | | | | | | | | | | | | | | | | | | | Sinking of unordered atomic load into loop must be disallowed because it turns a single load into multiple loads. The relevant section of the documentation is: http://llvm.org/docs/Atomics.html#unordered, specifically the Notes for Optimizers section. Here is the full text of this section: > Notes for optimizers > In terms of the optimizer, this **prohibits any transformation that > transforms a single load into multiple loads**, transforms a store into > multiple stores, narrows a store, or stores a value which would not be > stored otherwise. Some examples of unsafe optimizations are narrowing > an assignment into a bitfield, rematerializing a load, and turning loads > and stores into a memcpy call. Reordering unordered operations is safe, > though, and optimizers should take advantage of that because unordered > operations are common in languages that need them. Patch by Daniil Suchkov! Reviewed By: reames Differential Revision: https://reviews.llvm.org/D38392 llvm-svn: 315438
* [IRCE] Do not process empty safe rangesMax Kazantsev2017-10-112-6/+72
| | | | | | | | | | | | | | | IRCE should not apply when the safe iteration range is proved to be empty. In this case we do unneeded job creating pre/post loops and then never go to the main loop. This patch makes IRCE not apply to empty safe ranges, adds test for this situation and also modifies one of existing tests where it used to happen slightly. Reviewed By: anna Differential Revision: https://reviews.llvm.org/D38577 llvm-svn: 315437
* [GVN] Don't replace constants with constants.Davide Italiano2017-10-111-0/+13
| | | | | | | | This fixes PR34908. Patch by Alex Crichton! Differential Revision: https://reviews.llvm.org/D38765 llvm-svn: 315429
* Revert "[llvm-objcopy] Add support for --strip-sections to remove all ↵Jake Ehrlich2017-10-111-66/+0
| | | | | | | | section headers leaving only program headers and loadable segment data" This reverts commit rL315412 llvm-svn: 315417
* [llvm-objcopy] Add support for --strip-sections to remove all section ↵Jake Ehrlich2017-10-111-0/+66
| | | | | | | | | | | | | | | | | | headers leaving only program headers and loadable segment data elf utils implements a particularly extreme form of stripping that I'd like to support. eu-strip has an option called "strip-sections" that removes all section headers and leaves only program headers and the segment data. I have implemented this option partly as a test but mainly because in Fuchsia we would like to use this option to minimize the size of our executables. The other strip options that are on my list include --strip-all and --strip-debug. This is a preliminary implementation that I'd like to start using in Fuchsia builds if possible. This change implements such a stripping option for llvm-objcopy Differential Revision: https://reviews.llvm.org/D38335 llvm-svn: 315412
* [X86] Add 128-bit version of vbroadcasti32x2 to shuffle comment decoding.Craig Topper2017-10-112-10/+10
| | | | llvm-svn: 315395
* [llvm-objcopy] Add ability to remove multiple sections by nameJake Ehrlich2017-10-101-0/+130
| | | | | | | | | This change adds the ability to use the "-R"/"-remove-section" option multiple times. Differential Revision: https://reviews.llvm.org/D38332 llvm-svn: 315385
* [X86] Add broadcast patterns that allow a scalar_to_vector between the ↵Craig Topper2017-10-101-2/+1
| | | | | | | | broadcast and the load. We already have these patterns for AVX512VL, but not AVX1 or 2. llvm-svn: 315382
* Make the ELFFile constructor private.Rafael Espindola2017-10-102-0/+4
| | | | | | | | | With this all clients have to use the new create method which returns an Expected. Fixes a crash on invalid input. llvm-svn: 315376
* Make the ELFObjectFile constructor private.Rafael Espindola2017-10-101-4/+4
| | | | | | | This forces every user to use the new create method that returns an Expected. This in turn propagates better error messages. llvm-svn: 315371
* Use the first instruction's count to estimate the funciton's entry frequency.Dehao Chen2017-10-101-0/+4
| | | | | | | | | | | | | | Summary: In the current implementation, we only have accurate profile count for standalone symbols. For inlined functions, we do not have entry count data because it's not available in LBR. In this patch, we use the first instruction's frequency to estimiate the function's entry count, especially for inlined functions. This may be inaccurate due to debug info in optimized code. However, this is a better estimate than the static 80/20 estimation we have in the current implementation. Reviewers: tejohnson, davidxl Reviewed By: tejohnson Subscribers: sanjoy, llvm-commits, aprantl Differential Revision: https://reviews.llvm.org/D38478 llvm-svn: 315369
* [x86] fix prefix typos for CHECK lines; NFCSanjay Patel2017-10-101-586/+290
| | | | llvm-svn: 315368
* [mips] Correct the instruction predicates for microMIPSr3Simon Dardis2017-10-101-1/+2
| | | | | | | | | | | | | | | | Rather than using the AdditionalPredicates mechanism to guard the microMIPS instructions, use the existing predicates to properly guard those instructions. This also resolves a case where an instruction pattern was incorrectly available for microMIPS32R6, which caused a register allocation failure as the registers specified in the pattern were not available. Reviewers: nitesh.jain, atanasyan Differential Revision: https://reviews.llvm.org/D38451 llvm-svn: 315362
* AMDGPU: Fix missing skipFunction callsMatt Arsenault2017-10-101-1/+1
| | | | llvm-svn: 315361
* AMDGPU: Fix failure to select branch with optnoneMatt Arsenault2017-10-101-0/+54
| | | | | | | | opt-bisect/optnone disable the AMDGPUUniformAnnotateValues pass. The heuristic in the custom selector for brcond deferred the branch uniformity check to the pattern, which would fail. llvm-svn: 315360
OpenPOWER on IntegriCloud