summaryrefslogtreecommitdiffstats
path: root/llvm/test
Commit message (Collapse)AuthorAgeFilesLines
* AMDGPU: Fix copies from physical registers in SIFixSGPRCopiesMatt Arsenault2017-04-291-0/+14
| | | | | | | | | This would assert when there were multiple defs of a physical register. We just need to move all of the users of it. llvm-svn: 301730
* [llvm-pdbdump] Abstract some of the YAML/Raw printing code.Zachary Turner2017-04-292-79/+87
| | | | | | | | | There is a lot of duplicate code for printing line info between YAML and the raw output printer. This introduces a base class that can be shared between the two, and makes some minor cleanups in the process. llvm-svn: 301728
* [ObjCARC] Do not move a release between a call and aAkira Hatanaka2017-04-291-0/+25
| | | | | | | | | | | | | | | | | | | retainAutoreleasedReturnValue that retains the returned value. This commit fixes a bug in ARC optimizer where it moves a release between a call and a retainAutoreleasedReturnValue, causing the returned object to be released before the retainAutoreleasedReturnValue can retain it. This commit accomplishes that by doing a lookahead and checking whether the call prevents the release from moving upwards. In the long term, we should treat the region between the retainAutoreleasedReturnValue and the call as a critical section and disallow moving anything there (possibly using operand bundles). rdar://problem/20449878 llvm-svn: 301724
* [LoopUnswitch] Don't remove instructions with side effects.Davide Italiano2017-04-291-0/+19
| | | | | | | | This fixes PR32818. Differential Revision: https://reviews.llvm.org/D32664 llvm-svn: 301722
* [InstCombine] add tests to show potentially bogus application of DeMorgan (NFC)Sanjay Patel2017-04-281-0/+22
| | | | llvm-svn: 301714
* InferAddressSpaces: Search constant expressions for addrspacecastsMatt Arsenault2017-04-282-3/+20
| | | | | | | These are pretty common when using local memory, and the 64-bit generic addressing is much more expensive to compute. llvm-svn: 301711
* Remove line and file from DINamespace.Adrian Prantl2017-04-2824-50/+50
| | | | | | | | | | | | | | Fixes the issue highlighted in http://lists.llvm.org/pipermail/cfe-dev/2014-June/037500.html. The DW_AT_decl_file and DW_AT_decl_line attributes on namespaces can prevent LLVM from uniquing types that are in the same namespace. They also don't carry any meaningful information. rdar://problem/17484998 Differential Revision: https://reviews.llvm.org/D32648 llvm-svn: 301706
* InferAddressSpaces: Infer from just addrspacecastsMatt Arsenault2017-04-282-7/+58
| | | | | | | | Eliminates some more cases where some subset of the addressing computation remains flat. Some cases with addrspacecasts in nested constant expressions are still left behind however. llvm-svn: 301704
* [RDF] Correctly calculate lane masks for defsKrzysztof Parzyszek2017-04-281-0/+52
| | | | llvm-svn: 301700
* Properly handle PHIs with subregisters in UnreachableBlockElimKrzysztof Parzyszek2017-04-281-0/+25
| | | | | | | | | When a PHI operand has a subregister, create a COPY instead of simply replacing the PHI output with the input it. Differential Revision: https://reviews.llvm.org/D32650 llvm-svn: 301699
* [Hexagon] Do not move a block if it is on a fall-through pathKrzysztof Parzyszek2017-04-281-0/+71
| | | | llvm-svn: 301698
* [WebAssembly] Add size of section header to data relocation offsets.Sam Clegg2017-04-281-0/+26
| | | | | | | | | | | Also, add test for data relocations and fix addend to be signed. Subscribers: jfb, dschuff Differential Revision: https://reviews.llvm.org/D32513 llvm-svn: 301690
* [ValueTracking] Teach isSafeToSpeculativelyExecute() about the speculatable ↵Matt Arsenault2017-04-281-0/+23
| | | | | | | | attribute Patch by Tom Stellard llvm-svn: 301688
* [WebAssembly] Write initial memory in pages not bytesSam Clegg2017-04-282-0/+10
| | | | | | | | Subscribers: jfb, dschuff Differential Revision: https://reviews.llvm.org/D32660 llvm-svn: 301687
* Add speculatable function attributeMatt Arsenault2017-04-284-5/+59
| | | | | | | | This attribute tells the optimizer that the function may be speculated. Patch by Tom Stellard llvm-svn: 301680
* AMDGPU: Add new amdgcn.init.exec intrinsicsMarek Olsak2017-04-281-0/+80
| | | | | | | | | | v2: More tests, bug fixes, cosmetic changes. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D31762 llvm-svn: 301677
* [bpf] add bigendian support to disassemblerAlexei Starovoitov2017-04-281-0/+18
| | | | | | | | | . swap 4-bit register encoding, 16-bit offset and 32-bit imm to support big endian archs . add a test Reported-by: David S. Miller <davem@davemloft.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 301653
* [InlineCost] Improve the cost heuristic for SwitchJun Bum Lim2017-04-281-0/+123
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The motivation example is like below which has 13 cases but only 2 distinct targets ``` lor.lhs.false2: ; preds = %if.then switch i32 %Status, label %if.then27 [ i32 -7012, label %if.end35 i32 -10008, label %if.end35 i32 -10016, label %if.end35 i32 15000, label %if.end35 i32 14013, label %if.end35 i32 10114, label %if.end35 i32 10107, label %if.end35 i32 10105, label %if.end35 i32 10013, label %if.end35 i32 10011, label %if.end35 i32 7008, label %if.end35 i32 7007, label %if.end35 i32 5002, label %if.end35 ] ``` which is compiled into a balanced binary tree like this on AArch64 (similar on X86) ``` .LBB853_9: // %lor.lhs.false2 mov w8, #10012 cmp w19, w8 b.gt .LBB853_14 // BB#10: // %lor.lhs.false2 mov w8, #5001 cmp w19, w8 b.gt .LBB853_18 // BB#11: // %lor.lhs.false2 mov w8, #-10016 cmp w19, w8 b.eq .LBB853_23 // BB#12: // %lor.lhs.false2 mov w8, #-10008 cmp w19, w8 b.eq .LBB853_23 // BB#13: // %lor.lhs.false2 mov w8, #-7012 cmp w19, w8 b.eq .LBB853_23 b .LBB853_3 .LBB853_14: // %lor.lhs.false2 mov w8, #14012 cmp w19, w8 b.gt .LBB853_21 // BB#15: // %lor.lhs.false2 mov w8, #-10105 add w8, w19, w8 cmp w8, #9 // =9 b.hi .LBB853_17 // BB#16: // %lor.lhs.false2 orr w9, wzr, #0x1 lsl w8, w9, w8 mov w9, #517 and w8, w8, w9 cbnz w8, .LBB853_23 .LBB853_17: // %lor.lhs.false2 mov w8, #10013 cmp w19, w8 b.eq .LBB853_23 b .LBB853_3 .LBB853_18: // %lor.lhs.false2 mov w8, #-7007 add w8, w19, w8 cmp w8, #2 // =2 b.lo .LBB853_23 // BB#19: // %lor.lhs.false2 mov w8, #5002 cmp w19, w8 b.eq .LBB853_23 // BB#20: // %lor.lhs.false2 mov w8, #10011 cmp w19, w8 b.eq .LBB853_23 b .LBB853_3 .LBB853_21: // %lor.lhs.false2 mov w8, #14013 cmp w19, w8 b.eq .LBB853_23 // BB#22: // %lor.lhs.false2 mov w8, #15000 cmp w19, w8 b.ne .LBB853_3 ``` However, the inline cost model estimates the cost to be linear with the number of distinct targets and the cost of the above switch is just 2 InstrCosts. The function containing this switch is then inlined about 900 times. This change use the general way of switch lowering for the inline heuristic. It etimate the number of case clusters with the suitability check for a jump table or bit test. Considering the binary search tree built for the clusters, this change modifies the model to be linear with the size of the balanced binary tree. The model is off by default for now : -inline-generic-switch-cost=false This change was originally proposed by Haicheng in D29870. Reviewers: hans, bmakam, chandlerc, eraman, haicheng, mcrosier Reviewed By: hans Subscribers: joerg, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D31085 llvm-svn: 301649
* Memory intrinsic value profile optimization: Avoid divide by 0Teresa Johnson2017-04-281-0/+19
| | | | | | | | | | | | | | Summary: Skip memops if the total value profiled count is 0, we can't correctly scale up the counts and there is no point anyway. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32624 llvm-svn: 301645
* [DAGCombiner] Add ComputeNumSignBits vector demanded elements support to ↵Simon Pilgrim2017-04-281-17/+11
| | | | | | | | ASHR and INSERT_VECTOR_ELT (reapplied) Reapplied r299221 after fix for nondeterminism in ThinLTO builder (rL301599), with extra check for implicit truncation of inserted element. llvm-svn: 301644
* [X86][SSE] Added new tests from D32416 to show codegen deltaSimon Pilgrim2017-04-281-0/+87
| | | | llvm-svn: 301641
* [X86][SSE] Renames all ones test to better match type.Simon Pilgrim2017-04-281-130/+204
| | | | | | Added 8f32/4f64 optsize tests discussed on D32416 llvm-svn: 301639
* [X86][SSE] Add codegen test for _mm_set_pd1 (PR32827)Simon Pilgrim2017-04-281-0/+16
| | | | llvm-svn: 301638
* [DebugInfo][X86] Improve X86 Optimize LEAs handling of debug values.Andrew Ng2017-04-281-45/+48
| | | | | | | | | | | | | | | This is a follow up to the fix in r298360 to improve the handling of debug values when redundant LEAs are removed. The fix in r298360 effectively discarded the debug values. This patch now attempts to preserve the debug values by using the DWARF DW_OP_stack_value operation via prependDIExpr. Moved functions appendOffset and prependDIExpr from Local.cpp to DebugInfoMetadata.cpp and made them available as static member functions of DIExpression. Differential Revision: https://reviews.llvm.org/D31604 llvm-svn: 301630
* [ARM] GlobalISel: Tighten test. NFCDiana Picus2017-04-281-27/+27
| | | | | | Explicitly check types and load sizes in the IRTranslator test. llvm-svn: 301627
* [EarlyCSE] Mark the condition of assume intrinsic as trueMax Kazantsev2017-04-281-0/+190
| | | | | | | | | | | | | | EarlyCSE should not just ignore assumes. It should use the fact that its condition is true for all dominated instructions. Reviewers: sanjoy, reames, apilipenko, anna, skatkov Reviewed By: reames, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32482 llvm-svn: 301625
* [EarlyCSE] Remove guards with conditions known to be trueMax Kazantsev2017-04-281-0/+156
| | | | | | | | | | | | | | | | | If a condition is calculated only once, and there are multiple guards on this condition, we should be able to remove all guards dominated by the first of them. This patch allows EarlyCSE to try to find the condition of a guard among the known values, and if it is true, remove the guard. Otherwise we keep the guard and mark its condition as 'true' for future consideration. Reviewers: sanjoy, reames, apilipenko, skatkov, anna, dberlin Reviewed By: reames, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32476 llvm-svn: 301623
* [StackMaps] Increase the size of the "location size" fieldSanjoy Das2017-04-2819-339/+1065
| | | | | | | | | | | | | | | | | | | | Summary: In some cases LLVM (especially the SLP vectorizer) will create vectors that are 256 bytes (or larger). Given that this is intentional[0] is likely to get more common, this patch updates the StackMap binary format to deal with the spill locations for said vectors. This change also bumps the stack map version from 2 to 3. [0]: https://reviews.llvm.org/D32533#738350 Reviewers: reames, kavon, skatkov, javed.absar Subscribers: mcrosier, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D32629 llvm-svn: 301615
* COFF Import: expose both symbolsSaleem Abdulrasool2017-04-282-0/+7
| | | | | | | | | COFF Import libraries which use the obsolete CONSTANT export are supposed to get two symbols, one with the `_imp_` prefix and one without. Ensure that we expose both for iteration. This is necessary to fix the librarian with COFF CONSTANT exports. llvm-svn: 301614
* [llvm-pdbdump] Allow printing only a portion of a stream.Zachary Turner2017-04-281-0/+47
| | | | | | | | | | | | When dumping raw data from a stream, you might know the offset of a certain record you're interested in, as well as how long that record is. Previously, you had to dump the entire stream and wade through the bytes to find the interesting record. This patch allows you to specify an offset and length on the command line, and it will only dump the requested range. llvm-svn: 301607
* [WebAssembly] Add some tests for wasm MC layerSam Clegg2017-04-283-3/+118
| | | | | | | | Subscribers: jfb, dschuff Differential Revision: https://reviews.llvm.org/D32558 llvm-svn: 301606
* [InstCombine] fix matcher to bind to specific operand (PR32830)Sanjay Patel2017-04-271-0/+20
| | | | | | | Matching any random value would be very wrong: https://bugs.llvm.org/show_bug.cgi?id=32830 llvm-svn: 301594
* [asan] Fix dead stripping of globals on Linux.Evgeniy Stepanov2017-04-275-14/+20
| | | | | | | | | | | | | | | | | | | | | | Use a combination of !associated, comdat, @llvm.compiler.used and custom sections to allow dead stripping of globals and their asan metadata. Sometimes. Currently this works on LLD, which supports SHF_LINK_ORDER with sh_link pointing to the associated section. This also works on BFD, which seems to treat comdats as all-or-nothing with respect to linker GC. There is a weird quirk where the "first" global in each link is never GC-ed because of the section symbols. At this moment it does not work on Gold (as in the globals are never stripped). This is a second re-land of r298158. This time, this feature is limited to -fdata-sections builds. llvm-svn: 301587
* [asan] Put ctor/dtor in comdat.Evgeniy Stepanov2017-04-272-1/+13
| | | | | | | | | | | | | | | | | | | | When possible, put ASan ctor/dtor in comdat. The only reason not to is global registration, which can be TU-specific. This is not the case when there are no instrumented globals. This is also limited to ELF targets, because MachO does not have comdat, and COFF linkers may GC comdat constructors. The benefit of this is a lot less __asan_init() calls: one per DSO instead of one per TU. It's also necessary for the upcoming gc-sections-for-globals change on Linux, where multiple references to section start symbols trigger quadratic behaviour in gold linker. This is a second re-land of r298756. This time with a flag to disable the whole thing to avoid a bug in the gold linker: https://sourceware.org/bugzilla/show_bug.cgi?id=19002 llvm-svn: 301586
* [X86][SSE] Add tests for broadcast from larger vector loadsSimon Pilgrim2017-04-271-0/+69
| | | | llvm-svn: 301583
* [llvm-readobj] Dump COFF Resources section.Zachary Turner2017-04-271-0/+19
| | | | | | | | | | | | This patch dumps the raw bytes of the .rsrc sections that are present in COFF object and executable files. Subsequent patches will parse this information and dump in a more human readable format. Differential Revision: https://reviews.llvm.org/D32463 Patch By: Eric Beckmann llvm-svn: 301578
* [PM/LoopUnswitch] Introduce a new, simpler loop unswitch pass.Chandler Carruth2017-04-2729-0/+1719
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, this pass only focuses on *trivial* loop unswitching. At that reduced problem it remains significantly better than the current loop unswitch: - Old pass is worse than cubic complexity. New pass is (I think) linear. - New pass is much simpler in its design by focusing on full unswitching. (See below for details on this). - New pass doesn't carry state for thresholds between pass iterations. - New pass doesn't carry state for correctness (both miscompile and infloop) between pass iterations. - New pass produces substantially better code after unswitching. - New pass can handle more trivial unswitch cases. - New pass doesn't recompute the dominator tree for the entire function and instead incrementally updates it. I've ported all of the trivial unswitching test cases from the old pass to the new one to make sure that major functionality isn't lost in the process. For several of the test cases I've worked to improve the precision and rigor of the CHECKs, but for many I've just updated them to handle the new IR produced. My initial motivation was the fact that the old pass carried state in very unreliable ways between pass iterations, and these mechansims were incompatible with the new pass manager. However, I discovered many more improvements to make along the way. This pass makes two very significant assumptions that enable most of these improvements: 1) Focus on *full* unswitching -- that is, completely removing whatever control flow construct is being unswitched from the loop. In the case of trivial unswitching, this means removing the trivial (exiting) edge. In non-trivial unswitching, this means removing the branch or switch itself. This is in opposition to *partial* unswitching where some part of the unswitched control flow remains in the loop. Partial unswitching only really applies to switches and to folded branches. These are very similar to full unrolling and partial unrolling. The full form is an effective canonicalization, the partial form needs a complex cost model, cannot be iterated, isn't canonicalizing, and should be a separate pass that runs very late (much like unrolling). 2) Leverage LLVM's Loop machinery to the fullest. The original unswitch dates from a time when a great deal of LLVM's loop infrastructure was missing, ineffective, and/or unreliable. As a consequence, a lot of complexity was added which we no longer need. With these two overarching principles, I think we can build a fast and effective unswitcher that fits in well in the new PM and in the canonicalization pipeline. Some of the remaining functionality around partial unswitching may not be relevant today (not many test cases or benchmarks I can find) but if they are I'd like to add support for them as a separate layer that runs very late in the pipeline. Purely to make reviewing and introducing this code more manageable, I've split this into first a trivial-unswitch-only pass and in the next patch I'll add support for full non-trivial unswitching against a *fixed* threshold, exactly like full unrolling. I even plan to re-use the unrolling thresholds, as these are incredibly similar cost tradeoffs: we're cloning a loop body in order to end up with simplified control flow. We should only do that when the total growth is reasonably small. One of the biggest changes with this pass compared to the previous one is that previously, each individual trivial exiting edge from a switch was unswitched separately as a branch. Now, we unswitch the entire switch at once, with cases going to the various destinations. This lets us unswitch multiple exiting edges in a single operation and also avoids numerous extremely bad behaviors, where we would introduce 1000s of branches to test for thousands of possible values, all of which would take the exact same exit path bypassing the loop. Now we will use a switch with 1000s of cases that can be efficiently lowered into a jumptable. This avoids relying on somehow forming a switch out of the branches or getting horrible code if that fails for any reason. Another significant change is that this pass actively updates the CFG based on unswitching. For trivial unswitching, this is actually very easy because of the definition of loop simplified form. Doing this makes the code coming out of loop unswitch dramatically more friendly. We still should run loop-simplifycfg (at the least) after this to clean up, but it will have to do a lot less work. Finally, this pass makes much fewer attempts to simplify instructions based on the unswitch. Something like loop-instsimplify, instcombine, or GVN can be used to do increasingly powerful simplifications based on the now dominating predicate. The old simplifications are things that something like loop-instsimplify should get today or a very, very basic loop-instcombine could get. Keeping that logic separate is a big simplifying technique. Most of the code in this pass that isn't in the old one has to do with achieving specific goals: - Updating the dominator tree as we go - Unswitching all cases in a switch in a single step. I think it is still shorter than just the trivial unswitching code in the old pass despite having this functionality. Differential Revision: https://reviews.llvm.org/D32409 llvm-svn: 301576
* [GlobalOpt] Correctly update metadata when localizing a global.Eli Friedman2017-04-271-0/+70
| | | | | | | | | Just calling dropAllReferences leaves pointers to the ConstantExpr behind, so we would eventually crash with a null pointer dereference. Differential Revision: https://reviews.llvm.org/D32551 llvm-svn: 301575
* Use a pointer type for target frame indices during statepoint loweringSanjoy Das2017-04-271-1/+39
| | | | | | | | | | | | | | | | | Summary: The type of the target frame index is intptr, not the type of the value we're going to store into it. Without this change we crash in the attached test case when trying to type-legalize a TargetFrameIndex. Patchpoint lowering types the target frame index as intptr as well. Reviewers: reames, bogner, arsenm Subscribers: arsenm, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D32256 llvm-svn: 301566
* [PartialInlining]: Improve partial inlining to handle complex conditionsXinliang David Li2017-04-277-7/+315
| | | | | | Differential Revision: http://reviews.llvm.org/D32249 llvm-svn: 301561
* [x86] add minimal tests for potential size-changing vsel transforms; NFCSanjay Patel2017-04-272-325/+643
| | | | llvm-svn: 301554
* [AMDGPU] DPP: add support for GFX9Sam Kolton2017-04-272-138/+147
| | | | | | | | | | Reviewers: artem.tamazov Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D32588 llvm-svn: 301551
* [mips][microMIPS] Adding code size reduction pass for MicroMIPSZoran Jovanovic2017-04-273-19/+30
| | | | | | | | | | | | | | | Author: milena.vujosevic.janicic Reviewers: sdardis The code implements size reduction pass for MicroMIPS. Load and store instructions are examined and transformed, if possible. lw32 instruction is transformed into 16-bit instruction lwsp sw32 instruction is transformed into 16-bit instruction swsp Arithmetic instrcutions are examined and transformed, if possible. addu32 instruction is transformed into 16-bit instruction addu16 subu32 instruction is transformed into 16-bit instruction subu16 Differential Revision: https://reviews.llvm.org/D15144 llvm-svn: 301540
* [ARM] GlobalISel: Fix extended stack operandsDiana Picus2017-04-272-4/+52
| | | | | | | | | | | | | | Fix a crash when trying to extend a value passed as a sign- or zero-extended stack parameter. The cause of the crash was that we were setting the size of the loaded value to 32 bits, and then tyring to extend again to 32 bits. This patch addresses the issue by also introducing a G_TRUNC after the load. This will leave the unused bits to their original values set by the caller, while being consistent about the types. For values that are not extended, we just use a smaller load. llvm-svn: 301531
* 2 tests that were lost in rL301390Andrew V. Tischenko2017-04-272-0/+24
| | | | llvm-svn: 301529
* [llvm-dwarfdump] - Change format for .gdb_index dump.George Rimar2017-04-271-2/+2
| | | | | | | | | | | | | | It is useful to output size of ranges when address ranges section of .gdb_index is dumped. It helps to compare outputs produced by different linkers, for example. In that case address ranges can look very different, when they are the same at fact. Difference comes from different low address because of different address of .text. Differential revision: https://reviews.llvm.org/D32492 llvm-svn: 301527
* [GlobalISel][X86] handle not symmetric G_COPYIgor Breger2017-04-273-0/+134
| | | | | | | | | | | | | | Summary: handle not symmetric G_COPY Reviewers: zvi, guyblank Reviewed By: guyblank Subscribers: rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D32420 llvm-svn: 301523
* AMDGPU: Fix assert in schedulerKonstantin Zhuravlyov2017-04-271-0/+95
| | | | | | | | Assert is triggered if DBG_VALUE is first instruction in BB Differential Revision: https://reviews.llvm.org/D32572 llvm-svn: 301511
* Disable GVN Hoist due to still more bugs being found in it. There isChandler Carruth2017-04-273-7/+7
| | | | | | | | | | also a discussion about exactly what we should do prior to re-enabling it. The current bug is http://llvm.org/PR32821 and the discussion about this is in the review thread for r300200. llvm-svn: 301505
* Revert r301487: Replace HashString algorithm with xxHash64Rui Ueyama2017-04-263-19/+16
| | | | | | This reverts commit r301487 to make buildbots green. llvm-svn: 301491
OpenPOWER on IntegriCloud