path: root/llvm/test
Commit message | Author | Age | Files | Lines
* Update the comments for the macho-invalid-zero-ncmds test and fix | Kevin Enderby | 2016-01-26 | 1 | -2/+6
    llvm-objdump when printing the Mach Header to print the unknown cputype and cpusubtype fields as decimal instead of not printing them at all. And change the test to check for that.
    llvm-svn: 258826
* [LibCallSimplifier] fold memset(malloc(x), 0, x) --> calloc(1, x) | Sanjay Patel | 2016-01-26 | 1 | -0/+11
    This is a step towards solving PR25892: https://llvm.org/bugs/show_bug.cgi?id=25892
    It won't handle the reported case. As noted by the 'TODO' comments in the patch, we need to relax the hasOneUse() constraint and also match patterns that include memset_chk() and the llvm.memset() intrinsic in addition to memset().
    Differential Revision: http://reviews.llvm.org/D16337
    llvm-svn: 258816
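The fold in the subject line can be pictured at the source level. The sketch below is only an illustration, not code from the patch: the helper names `zeroed_before` and `zeroed_after` are hypothetical, and the NULL guard is an addition so the example stays well defined when allocation fails.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Pattern before the fold: allocate x bytes, then zero all x bytes.
   The NULL guard is an addition for this sketch. */
void *zeroed_before(size_t x) {
    void *p = malloc(x);
    if (p != NULL)
        memset(p, 0, x);
    return p;
}

/* Pattern after the fold: one call that returns zero-initialized memory. */
void *zeroed_after(size_t x) {
    return calloc(1, x);
}
```

Both helpers hand back a buffer of `x` zero bytes (or NULL on failure), which is why the pair of calls can be replaced by the single `calloc` call.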
* Revert "Reapply commit r258404 with fix" | Matthew Simpson | 2016-01-26 | 1 | -18/+13
    This commit exposes a crash in computeKnownBits on the Chromium buildbots. Reverting to investigate.
    Reference: https://llvm.org/bugs/show_bug.cgi?id=26307
    llvm-svn: 258812
* Re-submit r256008 "Improve DWARFDebugFrame::parse to also handle __eh_frame." | Igor Laevsky | 2016-01-26 | 1 | -22/+22
    Originally this change was causing failures on Windows buildbots, but those problems were fixed in r258806.
    llvm-svn: 258811
* [X86][SSE] Add zero element and general 64-bit VZEXT_LOAD support to EltsFromConsecutiveLoads | Simon Pilgrim | 2016-01-26 | 1 | -45/+7
    This patch adds support for trailing zero elements to VZEXT_LOAD loads (and checks that no zero elts occur within the consecutive load).
    It also generalizes the 64-bit VZEXT_LOAD load matching to work for loads other than 2x32-bit loads.
    After this patch it will also be easier to add support for other basic load patterns like 32-bit VZEXT_LOAD loads, PMOVZX and subvector load insertion.
    Differential Revision: http://reviews.llvm.org/D16217
    llvm-svn: 258798
* AMDGPU: Make v32i8/v64i8 illegal types | Matt Arsenault | 2016-01-26 | 3 | -76/+187
    Old intrinsics were forcing these, but they have now all been removed. This fixes large i8 vector operations generally being broken.
    llvm-svn: 258788
* AMDGPU: Remove old sample intrinsics | Matt Arsenault | 2016-01-26 | 11 | -662/+138
    I did my best to try to update all the uses in tests that just happened to use the old ones to the newer intrinsics. I'm not sure I got all of the immediate operand conversions correct, since the value seems to have been ignored by the old pattern, but I don't think it really matters.
    llvm-svn: 258787
* AMDGPU: Add new amdgcn intrinsics for cube instructions | Matt Arsenault | 2016-01-26 | 6 | -2/+107
    More cleanup to try to get all intrinsics using the correct amdgcn prefix that are as close to the instruction as possible.
    llvm-svn: 258786
* AMDGPU: Implement read_register and write_register intrinsics | Matt Arsenault | 2016-01-26 | 6 | -0/+224
    Some of the special intrinsics that now correspond to an instruction also have special setting of some registers, e.g. llvm.SI.sendmsg sets m0 as well as using s_sendmsg. Using these explicit register intrinsics may be a better option.
    Reading the exec mask and others may be useful for debugging. For this I'm not sure this is entirely correct because we would want this to be convergent, although it's possible this is already treated sufficiently conservatively.
    llvm-svn: 258785
* AMDGPU: Restore AMDGPU prefixed rsq intrinsic for now | Matt Arsenault | 2016-01-26 | 2 | -0/+56
    Also move into backend intrinsics to discourage use of the old name.
    llvm-svn: 258783
* [WebAssembly] Optimize memcpy/memmove/memset calls. | Dan Gohman | 2016-01-26 | 2 | -2/+62
    These calls return their first argument, but because LLVM uses an intrinsic with a void return type, they can't use the returned attribute. Generalize the store results pass to optimize these calls too.
    llvm-svn: 258781
* [WebAssembly] Implement unaligned loads and stores. | Dan Gohman | 2016-01-26 | 4 | -10/+556
    Differential Revision: http://reviews.llvm.org/D16534
    llvm-svn: 258779
* [LIR] Add support for structs and hand unrolled loops | Haicheng Wu | 2016-01-26 | 3 | -0/+487
    This is a recommit of r258620, which caused PR26293.
    The original message:
    Now LIR can turn the following code into memset:

    typedef struct foo {
      int a;
      int b;
    } foo_t;

    void bar(foo_t *f, unsigned n) {
      for (unsigned i = 0; i < n; ++i) {
        f[i].a = 0;
        f[i].b = 0;
      }
    }

    void test(foo_t *f, unsigned n) {
      for (unsigned i = 0; i < n; i += 2) {
        f[i] = 0;
        f[i+1] = 0;
      }
    }

    llvm-svn: 258777
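For readers who want to compile the two loop shapes, here is a minimal sketch under stated assumptions. As quoted, `test` assigns `0` to a `foo_t` element, which a C compiler rejects, so this version assumes `unsigned` elements for the hand-unrolled loop and an even `n`. The `*_memset` helpers are hypothetical names showing the single call each loop amounts to; they are not taken from the patch.

```c
#include <assert.h>
#include <string.h>

typedef struct foo {
    int a;
    int b;
} foo_t;

/* Field-by-field zeroing: both fields of every element are cleared. */
void bar(foo_t *f, unsigned n) {
    for (unsigned i = 0; i < n; ++i) {
        f[i].a = 0;
        f[i].b = 0;
    }
}

/* Hand-unrolled zeroing, two elements per iteration.
   Assumption: unsigned elements and an even n. */
void test(unsigned *f, unsigned n) {
    for (unsigned i = 0; i < n; i += 2) {
        f[i] = 0;
        f[i + 1] = 0;
    }
}

/* Equivalent single calls (hypothetical helpers for comparison). */
void bar_memset(foo_t *f, unsigned n) {
    memset(f, 0, n * sizeof(foo_t));
}

void test_memset(unsigned *f, unsigned n) {
    memset(f, 0, n * sizeof(unsigned));
}
```

The struct case relies on `foo_t` having no padding between or after its two `int` fields, so zeroing every field touches every byte of the array.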
* Followup to 258750; update more tests to use .p2align . | Dan Gohman | 2016-01-26 | 4 | -5/+5
    llvm-svn: 258755
* Followup to 258750; update all MC tests to use .p2align . | Dan Gohman | 2016-01-26 | 10 | -75/+75
    llvm-svn: 258754
* Followup to 258750; update this test to use .p2align . | Dan Gohman | 2016-01-26 | 1 | -2/+2
    llvm-svn: 258752
* [MC] Use .p2align instead of .align | Dan Gohman | 2016-01-26 | 72 | -233/+233
    For historic reasons, the behavior of .align differs between targets. Fortunately, there are alternatives, .p2align and .balign, which make the interpretation of the parameter explicit, and which behave consistently across targets.
    This patch teaches MC to use .p2align instead of .align, so that people reading code for multiple architectures don't have to remember which way each platform does its .align directive.
    Differential Revision: http://reviews.llvm.org/D16549
    llvm-svn: 258750
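The difference between the two explicit directives is arithmetic: `.p2align n` requests an alignment of 2^n bytes, while `.balign n` requests n bytes. A small sketch of that arithmetic follows; the helper names `p2align_bytes` and `align_up` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Byte alignment requested by ".p2align n": 2 to the power n.
   Assumption: n < 64. */
uint64_t p2align_bytes(unsigned n) {
    return (uint64_t)1 << n;
}

/* Round addr up to the next multiple of bytes.
   Assumption: bytes is a power of two. */
uint64_t align_up(uint64_t addr, uint64_t bytes) {
    return (addr + bytes - 1) & ~(bytes - 1);
}
```

So `.p2align 4` and `.balign 16` describe the same 16-byte boundary, whereas a bare `.align 4` means 4 bytes on some targets and 16 bytes on others.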
* [cfi] Cross-DSO CFI diagnostic mode (LLVM part). | Evgeniy Stepanov | 2016-01-25 | 1 | -10/+15
    * __cfi_check gets a 3rd argument: ubsan handler data
    * Instead of trapping on failure, call __cfi_check_fail which must be present in the module (generated in the frontend).
    llvm-svn: 258746
* X86ISelLowering: Fix cmov(cmov) special lowering bug | Matthias Braun | 2016-01-25 | 1 | -0/+49
    There's a special case in EmitLoweredSelect() that produces an improved lowering for cmov(cmov) patterns. However, this special lowering is currently broken if the inner cmov has multiple users, so this patch stops using it in this case.
    If you wonder why this wasn't fixed by continuing to use the special lowering and inserting a 2nd PHI for the inner cmov: I believe this would incur additional copies/register pressure, so the special lowering does not improve upon the normal one anymore in this case.
    This fixes http://llvm.org/PR26256 (= rdar://24329747)
    llvm-svn: 258729
* [ThinLTO] Find all needed metadata when linking metadata as postpass | Teresa Johnson | 2016-01-25 | 1 | -3/+13
    For metadata postpass linking, after importing all functions, we need to recursively walk through any nodes reached via imported functions to locate needed subprogram metadata. Some might only be reached indirectly via the variable list for an inlined function.
    llvm-svn: 258728
* [X86][AVX] Add commutation support for VPERM2X128 instructions | Simon Pilgrim | 2016-01-25 | 1 | -0/+171
    Its main use is to allow memory folding of the 1st operand.
    Differential Revision: http://reviews.llvm.org/D16521
    llvm-svn: 258726
* [ThinLTO] Handle DISubprogram reached indirectly from DIImportedEntity | Teresa Johnson | 2016-01-25 | 1 | -2/+24
    Extend fix for PR26037 to identify DISubprogram reached from a DIImportedEntity via a DILexicalBlock.
    llvm-svn: 258722
* Enable loopreroll to reroll loop with pointer induction variable. | Lawrence Hu | 2016-01-25 | 1 | -0/+81
    Example:

    while (buf != end) {
      S += buf[0];
      S += buf[1];
      buf += 2;
    };

    Differential Revision: http://reviews.llvm.org/D13151
    llvm-svn: 258709
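To make the transformation concrete, the sketch below wraps the example loop in a function and places a rerolled form next to it. This is an illustration under assumptions: the function names are hypothetical, the element type is taken to be `int`, and the range `[buf, end)` is assumed to hold an even number of elements, which the unrolled form needs in order to terminate.

```c
#include <assert.h>

/* Unrolled form, as in the commit message: two elements per iteration,
   with the pointer itself as the induction variable. */
int sum_unrolled(const int *buf, const int *end) {
    int S = 0;
    while (buf != end) {
        S += buf[0];
        S += buf[1];
        buf += 2;
    }
    return S;
}

/* Rerolled form: one element per iteration, same result. */
int sum_rerolled(const int *buf, const int *end) {
    int S = 0;
    while (buf != end) {
        S += buf[0];
        buf += 1;
    }
    return S;
}
```

Both functions visit the same elements in the same order; the rerolled loop simply runs twice as many, shorter iterations.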
* Undo commit 258700 due to missing commit message | Lawrence Hu | 2016-01-25 | 1 | -81/+0
    llvm-svn: 258708
* Reapply commit r258404 with fix | Matthew Simpson | 2016-01-25 | 1 | -13/+18
    We were hitting an assertion because we were computing smaller type sizes for instructions that cannot be demoted. The fix first determines the instructions that will be demoted, and then applies the smaller type size to only those instructions.
    This should fix PR26239.
    llvm-svn: 258705
* Speculatively revert r258620 as it is the likely culprit of PR26293. | Quentin Colombet | 2016-01-25 | 3 | -487/+0
    llvm-svn: 258703
* Differential Revision: http://reviews.llvm.org/D13151 | Lawrence Hu | 2016-01-25 | 1 | -0/+81
    llvm-svn: 258700
* [WebAssembly] Fix unbalanced register stack code in the case of late DCE. | Dan Gohman | 2016-01-25 | 1 | -2/+2
    Instructions can be DCE'd after the RegStackify pass. If the instruction which would be the pop for what would be a push is removed, don't use a push.
    llvm-svn: 258694
* [WebAssembly] Add tests for negative offsets with global variable addresses. | Dan Gohman | 2016-01-25 | 1 | -0/+18
    llvm-svn: 258693
* [SelectionDAG] Use the correct return type for memcpy, memmove, and memset. | Dan Gohman | 2016-01-25 | 1 | -1/+1
    When generating calls to memcpy, memmove, and memset, use void* as the return type rather than void, to match the standard signatures for these functions.
    This has no practical effect for most targets, since the return values of these calls aren't being used anyway, and most calling conventions tolerate this kind of mismatch. However, this change will help support future optimizations to utilize the return value to avoid holding the argument value live across a call.
    llvm-svn: 258691
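The standard signatures referred to here return the destination pointer. The short sketch below shows that property from the caller's side; the wrapper names `copy_and_return` and `fill_and_return` are hypothetical and exist only so the returned pointer can be inspected.

```c
#include <assert.h>
#include <string.h>

/* memcpy returns its first argument, so a caller can keep using the
   return value instead of holding dst live across the call. */
char *copy_and_return(char *dst, const char *src, size_t n) {
    return memcpy(dst, src, n);
}

/* memset follows the same convention. */
char *fill_and_return(char *dst, int value, size_t n) {
    return memset(dst, value, n);
}
```

Because the result and the first argument are the same pointer, code after the call can use whichever of the two is cheaper to keep around.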
* [DemandedBits] Fix computation of demanded bits for ICmps | James Molloy | 2016-01-25 | 1 | -2/+11
    The computation of ICmp demanded bits is independent of the individual operand being evaluated. We simply return a mask consisting of the minimum leading zeroes of both operands.
    We were incorrectly passing "I" to ComputeKnownBits - this should be "UserI->getOperand(0)". In cases where we were evaluating operand 1, we were taking the minimum leading zeroes of it and itself.
    This should fix PR26266.
    llvm-svn: 258690
* [AVX512] Adding PTESTNMB/D/W/Q instruction | Michael Zuckerman | 2016-01-25 | 4 | -0/+218
    Differential Revision: http://reviews.llvm.org/D16520
    llvm-svn: 258688
* [AVX512] Adding PTESTMB/W/D/Q instruction | Michael Zuckerman | 2016-01-25 | 3 | -0/+180
    Differential Revision: http://reviews.llvm.org/D16519
    llvm-svn: 258686
* [ARM] Add DSP build attribute and extension targeting | Bradley Smith | 2016-01-25 | 6 | -4/+55
    This patch was originally committed as r257885, but was reverted due to Windows failures. The cause of these failures has been fixed under r258677, hence re-committing the original patch.
    llvm-svn: 258683
* [ARM] Add new system registers to ARMv8-M Baseline/Mainline | Bradley Smith | 2016-01-25 | 4 | -0/+417
    This patch was originally committed as r257884, but was reverted due to Windows failures. The cause of these failures has been fixed under r258677, hence re-committing the original patch.
    llvm-svn: 258682
* [ARM] Add ARMv8-M security extension instructions to ARMv8-M Baseline/Mainline | Bradley Smith | 2016-01-25 | 1 | -2/+93
    This patch was originally committed as r257883, but was reverted due to Windows failures. The cause of these failures has been fixed under r258677, hence re-committing the original patch.
    llvm-svn: 258681
* [X86][IFMA] adding intrinsics and encoding for multiply and add of unsigned 52-bit integer | Asaf Badouh | 2016-01-25 | 4 | -0/+733
    VPMADD52LUQ - Packed Multiply of Unsigned 52-bit Integers and Add the Low 52-bit Products to Qword Accumulators
    VPMADD52HUQ - Packed Multiply of Unsigned 52-bit Unsigned Integers and Add High 52-bit Products to 64-bit Accumulators
    Differential Revision: http://reviews.llvm.org/D16407
    llvm-svn: 258680
* [ARM] Add ARMv8.2-A FP16 scalar instructions | Oliver Stannard | 2016-01-25 | 6 | -0/+1192
    This was originally committed as r255762, but reverted as it broke Windows bots. Re-committing the exact same patch, as the underlying cause was fixed by r258677.
    ARMv8.2-A adds 16-bit floating point versions of all existing VFP floating-point instructions. This is an optional extension, so all of these instructions require the FeatureFullFP16 subtarget feature.
    The assembly for these instructions uses S registers (AArch32 does not have H registers), but the instructions have ".f16" type specifiers rather than ".f32" or ".f64". The top 16 bits of each source register are ignored, and the top 16 bits of the destination register are set to zero.
    These instructions are mostly the same as the 32- and 64-bit versions, but they use coprocessor 9 rather than 10 and 11.
    Two new instructions, VMOVX and VINS, have been added to allow packing and extracting two 16-bit floats stored in the top and bottom halves of an S register.
    New fixup kinds have been added for the PC-relative load and store instructions, but no ELF relocations have been added as they have a range of 512 bytes.
    Differential Revision: http://reviews.llvm.org/D15038
    llvm-svn: 258678
* AVX1 : Enable vector masked_load/store to AVX1. | Igor Breger | 2016-01-25 | 2 | -637/+831
    Use AVX1 FP instructions (vmaskmovps/pd) in place of the AVX2 int instructions (vpmaskmovd/q).
    Differential Revision: http://reviews.llvm.org/D16528
    llvm-svn: 258675
* [AVX512] [CMPPS][CMPPD] Adding full Comparison Predicate names | Michael Zuckerman | 2016-01-25 | 1 | -0/+208
    X86AsmParser.cpp is missing the full comparison predicate names for the CMPPD and CMPPS instructions. It defines only the short names of the comparison predicates, which are listed in the following PDF:
    https://software.intel.com/sites/default/files/managed/07/b7/319433-023.pdf
    Page 5-61, table 5-3
    Differential Revision: http://reviews.llvm.org/D16518
    llvm-svn: 258671
* Added Skylake client to X86 targets and features | Elena Demikhovsky | 2016-01-24 | 1 | -56/+196
    Changes in X86.td:
    I set features of Intel processors in incremental form: IVB = SNB + X, HSW = IVB + X, ..
    I added the Skylake client processor and defined its features.
    FeatureADX was missing on KNL.
    Added some new features to appropriate processors: SMAP, IFMA, PREFETCHWT1, VMFUNC and others.
    Differential Revision: http://reviews.llvm.org/D16357
    llvm-svn: 258659
* AVX512: VMOVDQU8/16/32/64 (load) intrinsic implementation. | Igor Breger | 2016-01-24 | 4 | -1/+241
    Differential Revision: http://reviews.llvm.org/D16137
    llvm-svn: 258657
* [WinEH] Don't miscompile cleanups which conditionally unwind to caller | David Majnemer | 2016-01-23 | 1 | -0/+24
    A cleanup can have paths which unwind or end up in unreachable. If there is an unreachable path *and* a path which unwinds to caller, we would mistakenly inject an unwind path to a catchswitch on the unreachable path. This results in a verifier assertion firing because the cleanup unwinds to two different places: to the caller and to the catchswitch.
    This occurred because we used getCleanupRetUnwindDest to determine if the cleanuppad had no cleanuprets. This is incorrect: getCleanupRetUnwindDest returns null for cleanuprets which unwind to caller.
    llvm-svn: 258651
* [SelectionDAG] Generalised the CONCAT_VECTORS creation to support BUILD_VECTOR and UNDEF folding. | Simon Pilgrim | 2016-01-23 | 1 | -2/+2
    llvm-svn: 258646
* [CUDA] Die gracefully when trying to output an LLVM alias. | Justin Lebar | 2016-01-23 | 1 | -0/+7
    Summary: Previously, we would just output "foo = bar" in the assembly, and then ptxas would choke. Now we die before emitting any invalid code.
    Reviewers: echristo
    Subscribers: jholewinski, llvm-commits, jhen, tra
    Differential Revision: http://reviews.llvm.org/D16490
    llvm-svn: 258638
* regenerate checks and note some near-term improvements | Sanjay Patel | 2016-01-23 | 1 | -239/+1100
    For the moment, this file takes way too long to run (see inline comments), but that should be a temporary problem. The fact that the compile time is so slow for a target that doesn't support maskmov may be a bug worth investigating too.
    llvm-svn: 258629
* [X86][SSE] Remove INSERTPS dependencies from unreferenced operands. | Simon Pilgrim | 2016-01-23 | 1 | -0/+32
    If the INSERTPS zeroes out all the referenced elements from either of the 2 input vectors (and the input is not already UNDEF), then set that input to UNDEF to reduce dependencies.
    llvm-svn: 258622
* [LIR] Add support for structs and hand unrolled loops | Haicheng Wu | 2016-01-23 | 3 | -0/+487
    Now LIR can turn the following code into memset:

    typedef struct foo {
      int a;
      int b;
    } foo_t;

    void bar(foo_t *f, unsigned n) {
      for (unsigned i = 0; i < n; ++i) {
        f[i].a = 0;
        f[i].b = 0;
      }
    }

    void test(foo_t *f, unsigned n) {
      for (unsigned i = 0; i < n; i += 2) {
        f[i] = 0;
        f[i+1] = 0;
      }
    }

    llvm-svn: 258620
* [PruneEH] Don't try to insert a terminator after another terminator | David Majnemer | 2016-01-23 | 1 | -0/+26
    LLVM's BasicBlock has a single terminator; it is not valid to have two.
    llvm-svn: 258616
* Put space after pointer type in test. NFC. | Manuel Jacob | 2016-01-23 | 1 | -2/+2
    llvm-svn: 258615