summaryrefslogtreecommitdiffstats
path: root/llvm/test
Commit message (Collapse)AuthorAgeFilesLines
* [ThinLTO] Handle DISubprogram reached indirectly from DIImportedEntityTeresa Johnson2016-01-251-2/+24
| | | | | | | Extend fix for PR26037 to identify DISubprogram reached from a DIImportedEntity via a DILexicalBlock. llvm-svn: 258722
* Enable loopreroll to rerool loop with pointer induction variable.Lawrence Hu2016-01-251-0/+81
| | | | | | | | | | | | | | Example: while (buf !=end ) { S += buf[0]; S += buf[1]; buf +=2; }; Differential Revision: http://reviews.llvm.org/D13151 llvm-svn: 258709
* Undo commit 258700 due to missing commit messageLawrence Hu2016-01-251-81/+0
| | | | llvm-svn: 258708
* Reapply commit r25804 with fixMatthew Simpson2016-01-251-13/+18
| | | | | | | | | | | We were hitting an assertion because we were computing smaller type sizes for instructions that cannot be demoted. The fix first determines the instructions that will be demoted, and then applies the smaller type size to only those instructions. This should fix PR26239. llvm-svn: 258705
* Speculatively revert r258620 as it is the likely culprid of PR26293.Quentin Colombet2016-01-253-487/+0
| | | | llvm-svn: 258703
* Differential Revision: http://reviews.llvm.org/D13151Lawrence Hu2016-01-251-0/+81
| | | | llvm-svn: 258700
* [WebAssembly] Fix unbalanced register stack code in the case of late DCE.Dan Gohman2016-01-251-2/+2
| | | | | | | Instructions can be DCE'd after the RegStackify pass. If the instruction which would be the pop for what would be a push is removed, don't use a push. llvm-svn: 258694
* [WebAssembly] Add tests for negative offsets with global variable addresses.Dan Gohman2016-01-251-0/+18
| | | | llvm-svn: 258693
* [SelectionDAG] Use the correct return type for memcpy, memmove, and memset.Dan Gohman2016-01-251-1/+1
| | | | | | | | | | | | | When generating calls to memcpy, memmove, and memset, use void* as the return type rather than void, to match the standard signatures for these functions. This has no practical effect for most targets, since the return values of these calls aren't being used anyway, and most calling conventions tolerate this kind of mismatch. However, this change will help support future optimizations to utilize the return value to avoid holding the argument value live across a call. llvm-svn: 258691
* [DemandedBits] Fix computation of demanded bits for ICmpsJames Molloy2016-01-251-2/+11
| | | | | | | | | | The computation of ICmp demanded bits is independent of the individual operand being evaluated. We simply return a mask consisting of the minimum leading zeroes of both operands. We were incorrectly passing "I" to ComputeKnownBits - this should be "UserI->getOperand(0)". In cases where we were evaluating the 1th operand, we were taking the minimum leading zeroes of it and itself. This should fix PR26266. llvm-svn: 258690
* [AVX512] Adding PTESTNMB/D/W/Q instructionMichael Zuckerman2016-01-254-0/+218
| | | | | | Differential Revision: http://reviews.llvm.org/D16520 llvm-svn: 258688
* [AVX512] Adding PTESTMB/W/D/Q instruction Michael Zuckerman2016-01-253-0/+180
| | | | | | Differential Revision: http://reviews.llvm.org/D16519 llvm-svn: 258686
* [ARM] Add DSP build attribute and extension targetingBradley Smith2016-01-256-4/+55
| | | | | | | | This patch was originally committed as r257885, but was reverted due to windows failures. The cause of these failures has been fixed under r258677, hence re-committing the original patch. llvm-svn: 258683
* [ARM] Add new system registers to ARMv8-M Baseline/MainlineBradley Smith2016-01-254-0/+417
| | | | | | | | This patch was originally committed as r257884, but was reverted due to windows failures. The cause of these failures has been fixed under r258677, hence re-committing the original patch. llvm-svn: 258682
* [ARM] Add ARMv8-M security extension instructions to ARMv8-M Baseline/MainlineBradley Smith2016-01-251-2/+93
| | | | | | | | This patch was originally committed as r257883, but was reverted due to windows failures. The cause of these failures has been fixed under r258677, hence re-committing the original patch. llvm-svn: 258681
* [X86][IFMA] adding intrinsics and encoding for multiply and add of unsigned ↵Asaf Badouh2016-01-254-0/+733
| | | | | | | | | | | 52bit integer VPMADD52LUQ - Packed Multiply of Unsigned 52-bit Integers and Add the Low 52-bit Products to Qword Accumulators VPMADD52HUQ - Packed Multiply of Unsigned 52-bit Unsigned Integers and Add High 52-bit Products to 64-bit Accumulators Differential Revision: http://reviews.llvm.org/D16407 llvm-svn: 258680
* [ARM] Add ARMv8.2-A FP16 scalar instructionsOliver Stannard2016-01-256-0/+1192
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was originally committed as r255762, but reverted as it broke windows bots. Re-commitiing the exact same patch, as the underlying cause was fixed by r258677. ARMv8.2-A adds 16-bit floating point versions of all existing VFP floating-point instructions. This is an optional extension, so all of these instructions require the FeatureFullFP16 subtarget feature. The assembly for these instructions uses S registers (AArch32 does not have H registers), but the instructions have ".f16" type specifiers rather than ".f32" or ".f64". The top 16 bits of each source register are ignored, and the top 16 bits of the destination register are set to zero. These instructions are mostly the same as the 32- and 64-bit versions, but they use coprocessor 9 rather than 10 and 11. Two new instructions, VMOVX and VINS, have been added to allow packing and extracting two 16-bit floats stored in the top and bottom halves of an S register. New fixup kinds have been added for the PC-relative load and store instructions, but no ELF relocations have been added as they have a range of 512 bytes. Differential Revision: http://reviews.llvm.org/D15038 llvm-svn: 258678
* AVX1 : Enable vector masked_load/store to AVX1.Igor Breger2016-01-252-637/+831
| | | | | | | | Use AVX1 FP instructions (vmaskmovps/pd) in place of the AVX2 int instructions (vpmaskmovd/q). Differential Revision: http://reviews.llvm.org/D16528 llvm-svn: 258675
* [AVX512] [CMPPS ][ CMPPD ] Adding full Comparison Predicate names Michael Zuckerman2016-01-251-0/+208
| | | | | | | | | | | X86AsmParser.cpp is missing full comparison predicate names for CMPPD and CMPPS Instructions. X86AsmParser.cpp defines only the short names of the Comparison predicate that you can find in the following pdf: https://software.intel.com/sites/default/files/managed/07/b7/319433-023.pdf Page 5-61 table 5-3 Differential Revision: http://reviews.llvm.org/D16518 llvm-svn: 258671
* Added Skylake client to X86 targets and featuresElena Demikhovsky2016-01-241-56/+196
| | | | | | | | | | | | | Changes in X86.td: I set features of Intel processors in incremental form: IVB = SNB + X HSW = IVB + X .. I added Skylake client processor and defined it's features FeatureADX was missing on KNL Added some new features to appropriate processors SMAP, IFMA, PREFETCHWT1, VMFUNC and others Differential Revision: http://reviews.llvm.org/D16357 llvm-svn: 258659
* AVX512: VMOVDQU8/16/32/64 (load) intrinsic implementation.Igor Breger2016-01-244-1/+241
| | | | | | Differential Revision: http://reviews.llvm.org/D16137 llvm-svn: 258657
* [WinEH] Don't miscompile cleanups which conditionally unwind to callerDavid Majnemer2016-01-231-0/+24
| | | | | | | | | | | | | | | | A cleanup can have paths which unwind or end up in unreachable. If there is an unreachable path *and* a path which unwinds to caller, we would mistakenly inject an unwind path to a catchswitch on the unreachable path. This results in a verifier assertion firing because the cleanup unwinds to two different places: to the caller and to the catchswitch. This occured because we used getCleanupRetUnwindDest to determine if the cleanuppad had no cleanuprets. This is incorrect, getCleanupRetUnwindDest returns null for cleanuprets which unwind to caller. llvm-svn: 258651
* [SelectionDAG] Generalised the CONCAT_VECTORS creation to support ↵Simon Pilgrim2016-01-231-2/+2
| | | | | | BUILD_VECTOR and UNDEF folding. llvm-svn: 258646
* [CUDA] Die gracefully when trying to output an LLVM alias.Justin Lebar2016-01-231-0/+7
| | | | | | | | | | | | | | Summary: Previously, we would just output "foo = bar" in the assembly, and then ptxas would choke. Now we die before emitting any invalid code. Reviewers: echristo Subscribers: jholewinski, llvm-commits, jhen, tra Differential Revision: http://reviews.llvm.org/D16490 llvm-svn: 258638
* regenerate checks and note some near-term improvementsSanjay Patel2016-01-231-239/+1100
| | | | | | | | For the moment, this file takes way too long to run (see inline comments), but that should be a temporary problem. The fact that the compile time is so slow for a target that doesn't support maskmov may be a bug worth investigating too. llvm-svn: 258629
* [X86][SSE] Remove INSERTPS dependencies from unreferenced operands.Simon Pilgrim2016-01-231-0/+32
| | | | | | If the INSERTPS zeroes out all the referenced elements from either of the 2 input vectors (and the input is not already UNDEF), then set that input to UNDEF to reduce dependencies. llvm-svn: 258622
* [LIR] Add support for structs and hand unrolled loopsHaicheng Wu2016-01-233-0/+487
| | | | | | | | | | | | | | | | | | | | | | | | | Now LIR can turn following codes into memset: typedef struct foo { int a; int b; } foo_t; void bar(foo_t *f, unsigned n) { for (unsigned i = 0; i < n; ++i) { f[i].a = 0; f[i].b = 0; } } void test(foo_t *f, unsigned n) { for (unsigned i = 0; i < n; i += 2) { f[i] = 0; f[i+1] = 0; } } llvm-svn: 258620
* [PruneEH] Don't try to insert a terminator after another terminatorDavid Majnemer2016-01-231-0/+26
| | | | | | LLVM's BasicBlock has a single terminator, it is not valid to have two. llvm-svn: 258616
* Put space after pointer type in test. NFC.Manuel Jacob2016-01-231-2/+2
| | | | llvm-svn: 258615
* AMDGPU: Replace some deprecated intrinsic uses in testsMatt Arsenault2016-01-239-62/+59
| | | | llvm-svn: 258614
* AMDGPU: Run instnamer on a few testsMatt Arsenault2016-01-234-1727/+1713
| | | | | | This will make future test updates easier llvm-svn: 258613
* AMDGPU: Remove more unused intrinsicsMatt Arsenault2016-01-236-1831/+1943
| | | | | | Replace tests with lrp with basic IR expansion llvm-svn: 258612
* [PruneEH] FuncletPads must not have undef operandsDavid Majnemer2016-01-231-0/+30
| | | | | | | | | Instead of RAUW with undef, replace the first non-token instruction with unreachable. This fixes PR26263. llvm-svn: 258611
* AArch64ISel: Fix ccmp code selection matching deep expressions.Matthias Braun2016-01-231-0/+19
| | | | | | | | | | | | Some of the conditions necessary to produce ccmp sequences were only checked in recursive calls to emitConjunctionDisjunctionTree() after some of the earlier expressions were already built. Move all checks over to isConjunctionDisjunctionTree() so they are all checked before we start emitting instructions. Also rename some variable to better reflect their usage. llvm-svn: 258605
* [WinEH] Let cleanups post-dominated by unreachable get executedDavid Majnemer2016-01-224-4/+76
| | | | | | | | | | | | | | | | | | | | | | | | | Cleanups in C++ are a little weird. They are only guaranteed to be reliably executed if, and only if, there is a viable catch handler which can handle the exception. This means that reachability of a cleanup is lexically determined by it being nested with a try-block which unwinds to a catch. It is *cannot* be reasoned about by examining the control flow edges leaving a cleanup. Usually this is not a problem. It becomes a problem when there are *no* edges out of a cleanup because we believed that code post-dominated by the cleanup is dead. In LLVM's case, this code is what informs the personality routine about the presence of a suitable catch handler. However, the lack of edges to that catch handler makes the handler become unreachable which causes us to remove it. By removing the handler, the cleanup becomes unreachable. Instead, inject a catch-all handler with every cleanup that has no unwind edges. This will allow us to properly unwind the stack. This fixes PR25997. llvm-svn: 258580
* Fix the code that leads to the incorrect trigger of the report_fatal_error()Kevin Enderby2016-01-222-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | in MachOObjectFile::getSymbolByIndex() when a Mach-O file has a symbol table load command but the number of symbols are zero. The code in MachOObjectFile::symbol_begin_impl() should not be assuming there is a symbol at index 0, in cases there is no symbol table load command or the count of symbol is zero. So I also fixed that. And needed to fix MachOObjectFile::symbol_end_impl() to also do the same thing for no symbol table or one with zero entries. The code in MachOObjectFile::getSymbolByIndex() should trigger the report_fatal_error() for programmatic errors for any index when there is no symbol table load command and not return the end iterator. So also fixed that. Note there is no test case as this is a programmatic error. The test case using the file macho-invalid-bad-symbol-index has a symbol table load command with its number of symbols (nsyms) is zero. Which was incorrectly testing the bad triggering of the report_fatal_error() in in MachOObjectFile::getSymbolByIndex(). This test case is an invalid Mach-O file but not for that reason. It appears this Mach-O file use to have an nsyms value of 11, and what makes this Mach-O file invalid is the counts and indexes into the symbol table of the dynamic load command are now invalid because the number of symbol table entries (nsyms) is now zero. Which can be seen with the existing llvm-obdump: % llvm-objdump -private-headers macho-invalid-bad-symbol-index … Load command 4 cmd LC_SYMTAB cmdsize 24 symoff 4216 nsyms 0 stroff 4392 strsize 144 Load command 5 cmd LC_DYSYMTAB cmdsize 80 ilocalsym 0 nlocalsym 8 (past the end of the symbol table) iextdefsym 8 (greater than the number of symbols) nextdefsym 2 (past the end of the symbol table) iundefsym 10 (greater than the number of symbols) nundefsym 1 (past the end of the symbol table) ... And the native darwin tools generates an error for this file: % nm macho-invalid-bad-symbol-index nm: object: macho-invalid-bad-symbol-index truncated or malformed object (ilocalsym plus nlocalsym in LC_DYSYMTAB load command extends past the end of the symbol table) I added new checks for the indexes and sizes for these in the constructor of MachOObjectFile. And added comments for what would be a proper diagnostic messages. And changed the test case using macho-invalid-bad-symbol-index to test for the new error now produced. Also added a test with a valid Mach-O file with a symbol table load command where the number of symbols is zero that shows the report_fatal_error() is not called. llvm-svn: 258576
* fixed to test features, not CPU modelsSanjay Patel2016-01-221-73/+73
| | | | llvm-svn: 258568
* AMDGPU: Add new name for barrier intrinsicMatt Arsenault2016-01-221-0/+28
| | | | llvm-svn: 258558
* AMDGPU: Rename intrinsics to use amdgcn prefixMatt Arsenault2016-01-2220-222/+318
| | | | | | | | | | | The intrinsic target prefix should match the target name as it appears in the triple. This is not yet complete, but gets most of the important ones. llvm.AMDGPU.* intrinsics used by mesa and libclc are still handled for compatability for now. llvm-svn: 258557
* Make sure that any new and optimized objects created during GlobalOPT copy ↵Sergei Larin2016-01-223-0/+85
| | | | | | | | | | | | | | | | | | | all the attributes from the base object. Summary: Make sure that any new and optimized objects created during GlobalOPT copy all the attributes from the base object. A good example of improper behavior in the current implementation is section information associated with the GlobalObject. If a section was set for it, and GlobalOpt is creating/modifying a new object based on this one (often copying the original name), without this change new object will be placed in a default section, resulting in inappropriate properties of the new variable. The argument here is that if customer specified a section for a variable, any changes to it that compiler does should not cause it to change that section allocation. Moreover, any other properties worth representation in copyAttributesFrom() should also be propagated. Reviewers: jmolloy, joker-eph, joker.eph Subscribers: slarin, joker.eph, rafael, tobiasvk, llvm-commits Differential Revision: http://reviews.llvm.org/D16074 llvm-svn: 258556
* [PlaceSafepoints] Introduce a -spp-no-statepoints flagSanjoy Das2016-01-221-0/+21
| | | | | | | | | | | | | | | | | | | | Summary: This change adds a `-spp-no-statepoints` flag to PlaceSafepoints that bypasses the code that wraps newly introduced polls and existing calls in gc.statepoint. With `-spp-no-statepoints` enabled, PlaceSafepoints effectively becomes a safpeoint **poll** insertion pass. The eventual goal is to "constant fold" this option, along with `-rs4gc-use-deopt-bundles` to `true`, once clients using gc.statepoint are okay doing so. Reviewers: pgavlin, reames, JosephTremoulet Subscribers: sanjoy, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D16439 llvm-svn: 258551
* [AArch64] Cleanup ccmp test check labels. NFC.Ahmed Bougacha2016-01-221-10/+10
| | | | llvm-svn: 258541
* AMDGPU: Fix crash with invariant markersMatt Arsenault2016-01-221-0/+25
| | | | | | | | The promote alloca pass didn't handle these intrinsics and crashed. These intrinsics should accept any address space, but for now just erase them to avoid breaking. llvm-svn: 258537
* [NVPTX] expand mul_lohi to mul_lo and mul_hiJingyue Wu2016-01-221-0/+24
| | | | | | | | | | | | Summary: Fixes PR26186. Reviewers: grosser, jholewinski Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D16479 llvm-svn: 258536
* [AArch64] Lower 2-CC FCCMPs (one/ueq) using AND'ed CCs.Ahmed Bougacha2016-01-221-18/+160
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current behavior is incorrect, as the two CCs returned by changeFPCCToAArch64CC, intended to be OR'ed, are instead used in an AND ccmp chain. Consider: define i32 @t(float %a, float %b, float %c, float %d, i32 %e, i32 %f) { %cc1 = fcmp one float %a, %b %cc2 = fcmp olt float %c, %d %and = and i1 %cc1, %cc2 %r = select i1 %and, i32 %e, i32 %f ret i32 %r } Assuming (%a < %b) and (%c < %d); we used to do: fcmp s0, s1 # nzcv <- 1000 orr w8, wzr, #0x1 # w8 <- 1 csel w9, w8, wzr, mi # w9 <- 1 csel w8, w8, w9, gt # w8 <- 1 fcmp s2, s3 # nzcv <- 1000 cset w9, mi # w9 <- 1 tst w8, w9 # (w8 & w9) == 1, so: nzcv <- 0000 csel w0, w0, w1, ne # w0 <- w0 We now do: fcmp s2, s3 # nzcv <- 1000 fccmp s0, s1, #0, mi # mi, so: nzcv <- 1000 fccmp s0, s1, #8, le # !le, so: nzcv <- 1000 csel w0, w0, w1, pl # !pl, so: w0 <- w1 In other words, we transformed: (c < d) && ((a < b) || (a > b)) into: (c < d) && (a u>= b) && (a u<= b) whereas, per De Morgan's, we wanted: (c < d) && !((a u>= b) && (a u<= b)) Note that this problem doesn't occur in the test-suite. changeFPCCToAArch64CC produces disjunct CCs; here, one -> mi/gt. We can't represent that in the fccmp chain; it can't express arbitrary OR sequences, as one comment explains: In general we can create code for arbitrary "... (and (and A B) C)" sequences. We can also implement some "or" expressions, because "(or A B)" is equivalent to "not (and (not A) (not B))" and we can implement some negation operations. [...] However there is no way to negate the result of a partial sequence. Instead, introduce changeFPCCToANDAArch64CC, which produces the conjunct cond codes: - (a one b) == ((a olt b) || (a ogt b)) == ((a ord b) && (a une b)) - (a ueq b) == ((a uno b) || (a oeq b)) == ((a ule b) && (a uge b)) Note that, at first, one might think that, when PushNegate is true, we should use the disjunct CCs, in effect doing: (a || b) = !(!a && !(b)) = !(!a && !(b1 || b2)) <- changeFPCCToAArch64CC(b, b1, b2) = !(!a && !b1 && !b2) However, we can take advantage of the fact that the CC is already negated, which lets us avoid special-casing PushNegate and doing the simpler to reason about: (a || b) = !(!a && (!b)) = !(!a && (b1 && b2)) <- changeFPCCToANDAArch64CC(!b, b1, b2) = !(!a && b1 && b2) This makes both emitConditionalCompare cases behave identically, and produces correct ccmp sequences for the 2-CC fcmps. llvm-svn: 258533
* [Hexagon] Use general purpose registers to spill pred/mod registers intoKrzysztof Parzyszek2016-01-221-0/+42
| | | | | | Patch by Tobias Edler Von Koch. llvm-svn: 258527
* AMDGPU: Rename some r600 intrinsics to use correct TargetPrefixMatt Arsenault2016-01-222-9/+9
| | | | | | These ones aren't directly emitted by mesa and inserted by a pass. llvm-svn: 258523
* Fix MachOObjectFile::getSymbolName() to not call report_fatal_error()Kevin Enderby2016-01-221-1/+7
| | | | | | | | | | | but to return object_error::parse_failed.  Then made the code in llvm-nm do for Mach-O files what is done in the darwin native tools which is to print "bad string index" for bad string indexes. Updated the error message in the llvm-objdump test, and added tests to show llvm-nm prints "bad string index" and a test to print the actual bad string index value which in this case is 0xfe000002 when printing the fields as raw hex. llvm-svn: 258520
* AMDGPU: Remove AMDGPU.fract intrinsicMatt Arsenault2016-01-224-71/+87
| | | | | | | Mesa doesn't use this, and this is pattern matched already from fsub x, (ffloor x) llvm-svn: 258513
* [SelectionDAG] Fold more offsets into GlobalAddressesDan Gohman2016-01-225-11/+725
| | | | | | | | This reapplies r258296 and r258366, and also fixes an existing bug in SelectionDAG.cpp's isMemSrcFromString, neglecting to account for the offset in a GlobalAddressSDNode, which is uncovered by those patches. llvm-svn: 258482
OpenPOWER on IntegriCloud