summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [libFuzzer] remove usage of the old coverage instrumentationKostya Serebryany2017-02-283-70/+0
| | | | llvm-svn: 296536
* Fix PR 24415 (at least), by making our post-dominator tree behavior sane.Daniel Berlin2017-02-281-37/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Currently, our post-dom tree tries to ignore and remove the effects of infinite loops. It fails miserably at this, because it tries to do it ahead of time, and thus can only detect self-loops, and any other type of infinite loop, it pretends doesn't exist at all. This can, in a bunch of cases, lead to wrong answers and a completely empty post-dom tree. Wrong answer: ``` declare void foo() define internal void @f() { entry: br i1 undef, label %bb35, label %bb3.i bb3.i: call void @foo() br label %bb3.i bb35.loopexit3: br label %bb35 bb35: ret void } ``` We get: ``` Inorder PostDominator Tree: [1] <<exit node>> {0,7} [2] %bb35 {1,6} [3] %bb35.loopexit3 {2,3} [3] %entry {4,5} ``` This is a trivial modification of the testcase for PR 6047 Note that we pretend bb3.i doesn't exist. We also pretend that bb35 post-dominates entry. While it's true that it does not exit in a theoretical sense, it's not really helpful to try to ignore the effect and pretend that bb35 post-dominates entry. Worse, we pretend the infinite loop does nothing (it's usually considered a side-effect), and doesn't even exist, even when it calls a function. Sadly, this makes it impossible to use when you are trying to move code safely. All compilers also create virtual or real single exit nodes (including us), and connect infinite loops there (which this patch does). In fact, others have worked around our behavior here, to the point of building their own post-dom trees: https://zneak.github.io/fcd/2016/02/17/structuring.html and pointing out the region infrastructure is near-useless for them with postdom in this state :( Completely empty post-dom tree: ``` define void @spam() #0 { bb: br label %bb1 bb1: ; preds = %bb1, %bb br label %bb1 bb2: ; No predecessors! ret void } ``` Printing analysis 'Post-Dominator Tree Construction' for function 'foo': =============================-------------------------------- Inorder PostDominator Tree: [1] <<exit node>> {0,1} :( (note that even if you ignore the effects of infinite loops, bb2 should be present as an exit node that post-dominates nothing). This patch changes post-dom to properly handle infinite loops and does root finding during calculation to prevent empty tress in such cases. We match gcc's (and the canonical theoretical) behavior for infinite loops (find the backedge, connect it to the exit block). Testcases coming as soon as i finish running this on a ton of random graphs :) Reviewers: chandlerc, davide Subscribers: bryant, llvm-commits Differential Revision: https://reviews.llvm.org/D29705 llvm-svn: 296535
* [Hexagon] Fix instruction selection for sign-extending i1 to i64Krzysztof Parzyszek2017-02-281-27/+28
| | | | llvm-svn: 296532
* Actually add error handling to unpacking the dyld compact bind andKevin Enderby2017-02-281-11/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | other tables. Providing a helpful error message to what the error is and where the error occurred based on which opcode it was associated with. There have been handful of bug fixes dealing with bad bind info in object files, r294021 and r249845, which only put a band aid on the problem after a bad bind table was created after unpacking from its compact info. In these cases a bind table should have never been created and an error should have simply been generated. This change puts in place the plumbing to allow checking and returning of an error when the compact info is unpacked. This follows the model of iterators that can fail that Lang Hanes designed when fixing the problem for bad archives r275316 (or r275361). This change uses one of the existing test cases that now causes an error instead of printing <<bad library ordinal>> after a bad bind table is created. The error uses the offset into the opcode table as shown with the macOS dyldinfo(1) tool to indicate where the error is and which opcode and which parameter is in error. For example the exiting test case has this lazy binding opcode table: % dyldinfo -opcodes test/tools/llvm-objdump/Inputs/bad-ordinal.macho-x86_64 … lazy binding opcodes: 0x0000 BIND_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB(0x02, 0x00000010) 0x0002 BIND_OPCODE_SET_DYLIB_ORDINAL_IMM(2) In the test case the binary only has one library so setting the library ordinal to the value of 2 in the BIND_OPCODE_SET_DYLIB_ORDINAL_IMM opcode at 0x0002 above is an error. This now produces this error message: % llvm-objdump -lazy-bind bad-ordinal.macho-x86_64 … llvm-objdump: 'bad-ordinal.macho-x86_64': truncated or malformed object (for BIND_OPCODE_SET_DYLIB_ORDINAL_ULEB bad library ordinal: 2 (max 1) for opcode at: 0x2) This change provides the plumbing for the error handling and one example of an error message. Other error checks and test cases will be added in follow on commits. llvm-svn: 296527
* Mark some libFuzzer tests as XFAIL'd on DarwinMehdi Amini2017-02-284-0/+7
| | | | | | | | We're bringing up a bot on Green Dragon right now: http://green.lab.llvm.org/green/view/Experimental/job/libFuzzer llvm-svn: 296526
* AMDGPU: Fix types for VOP_I16_I16_I16Matt Arsenault2017-02-281-1/+1
| | | | llvm-svn: 296523
* AMDGPU: Add definition for v_swap_b32Matt Arsenault2017-02-281-4/+31
| | | | | | | | This is somewhat tricky because there are two pairs of tied operands, and it isn't allowed to be VOP3 encoded. llvm-svn: 296519
* AMDGPU: Add definition for v_xad_u32Matt Arsenault2017-02-281-0/+2
| | | | llvm-svn: 296515
* [DWARFv5] Emit new unit header format.Paul Robinson2017-02-287-15/+58
| | | | | | | | | Requesting DWARF v5 will now get you the new compile-unit and type-unit headers. llvm-dwarfdump will also recognize them. Differential Revision: http://reviews.llvm.org/D30206 llvm-svn: 296514
* AMDGPU: Add ds_nop to assemblerMatt Arsenault2017-02-281-1/+21
| | | | llvm-svn: 296513
* AMDGPU: Add definitions for ds_{read|write}_b{96|128}Matt Arsenault2017-02-281-4/+19
| | | | | | | | | It's not clear to me if this is always better than doing ds_write2_b64 This adds the constraint of a 128-bit register input instead of a pair of 64-bit. llvm-svn: 296512
* [AMDGPU] Add second pass of the schedulerStanislav Mekhanoshin2017-02-282-7/+126
| | | | | | | | | | | If during scheduling we have identified that we cannot keep optimistic occupancy increase critical register pressure limit and try scheduling of the whole function again. In this case blocks with smaller pressure will have a chance for better scheduling. Differential Revision: https://reviews.llvm.org/D30442 llvm-svn: 296506
* [DAGCombiner] use dyn_cast values in foldSelectOfConstants(); NFCSanjay Patel2017-02-281-6/+8
| | | | llvm-svn: 296502
* Fix -Wcovered-switch-default warning.Zachary Turner2017-02-281-2/+0
| | | | llvm-svn: 296501
* [LCG] Fix EXPENSIVE_CHECKS typo. NFCFrancis Visoiu Mistrih2017-02-281-5/+5
| | | | | | Differential Revision: https://reviews.llvm.org/D30434 llvm-svn: 296500
* Add function importing info from samplepgo profile to the module summary.Dehao Chen2017-02-285-16/+50
| | | | | | | | | | | | | | Summary: For SamplePGO, the profile may contain cross-module inline stacks. As we need to make sure the profile annotation happens when all the hot inline stacks are expanded, we need to pass this info to the module importer so that it can import proper functions if necessary. This patch implemented this feature by emitting cross-module targets as part of function entry metadata. In the module-summary phase, the metadata is used to build call edges that points to functions need to be imported. Reviewers: mehdi_amini, tejohnson Reviewed By: tejohnson Subscribers: davidxl, llvm-commits Differential Revision: https://reviews.llvm.org/D30053 llvm-svn: 296498
* [PDB] Add BinaryStreamError.Zachary Turner2017-02-285-25/+76
| | | | | | | This migrates the stream code away from MSFError to using its own custom Error class. llvm-svn: 296494
* Set default CPU for OpenBSD/arm to Cortex-A8Brad Smith2017-02-281-0/+1
| | | | llvm-svn: 296493
* [AMDGPU] New method to estimate register pressureStanislav Mekhanoshin2017-02-282-21/+150
| | | | | | | | | | | | | | | | | | | | | | | | This change introduces new method to estimate register pressure in GCNScheduler. Standard RPTracker gives huge error due to the following reasons: 1. It does not account for live-ins or live-outs if value is not used in the region itself. That creates a huge error in a very common case if there are a lot of live-thu registers. 2. It does not properly count subregs. 3. It assumes a register used as an input operand can be reused as an output. This is not always possible by itself, this is not what RA will finally do in many cases for various reasons not limited to RA's inability to do so, and this is not so if the value is actually a live-thu. In addition we can now see clear separation between live-in pressure which we cannot change with the scheduling and tentative pressure which we can change. Differential Revision: https://reviews.llvm.org/D30439 llvm-svn: 296491
* [AMDGPU] Change amd_kernel_code_t's minor version to 1Konstantin Zhuravlyov2017-02-281-1/+1
| | | | | | | | - We do emit amd_kernel_code_t v1.1 Differential Revision: https://reviews.llvm.org/D30433 llvm-svn: 296489
* Strip debug info when inlining into a nodebug function.Adrian Prantl2017-02-281-12/+30
| | | | | | | | | | | | | The LLVM backend cannot produce any debug info for an llvm::Function without a DISubprogram attachment. When inlining a debug-info-carrying function into a nodebug function, there is therefore no reason to keep any debug info intrinsic calls or debug locations on the instructions. This fixes a problem discovered in PR32042. rdar://problem/30679307 llvm-svn: 296488
* [DAGISel] When checking if chain node is foldable, make sure the ↵Craig Topper2017-02-281-1/+1
| | | | | | | | intermediate nodes have a single use across all results not just the result that was used to reach the chain node. This recovers a test case that was severely broken by r296476, my making sure we don't create ADD/ADC that loads and stores when there is also a flag dependency. llvm-svn: 296486
* [AMDGPU] Fix read-undef flags when schedule is revertedStanislav Mekhanoshin2017-02-281-12/+15
| | | | | | | | | | | | | If two subregs of the same register are defined and we need to revert schedule changing def order, we will end up with both instructions having def,read-undef flags because adjustLaneLiveness() will only set this flag but will not remove it. Fix this by removing read-undef flags before calling adjustLaneLiveness. Differential Revision: https://reviews.llvm.org/D30428 llvm-svn: 296484
* [Stack Protection] Add diagnostic information for why stack protection was ↵David Bozier2017-02-282-2/+41
| | | | | | | | | | | | | | applied to a function Stack Smash Protection is not completely free, so in hot code, the overhead it causes can cause performance issues. By adding diagnostic information for which functions have SSP and why, a user can quickly determine what they can do to stop SSP being applied to a specific hot function. This change adds a remark that is reported by the stack protection code when an instruction or attribute is encountered that causes SSP to be applied. Patch by: James Henderson Differential Revision: https://reviews.llvm.org/D29023 llvm-svn: 296483
* [mips] Fix 64bit slt/sltu/nor with immediatesSimon Dardis2017-02-283-6/+43
| | | | | | | | | | Patch By: Alexander Richardson Reviewers: atanasyan, theraven, sdardis Differential Revision: https://reviews.llvm.org/D30330 llvm-svn: 296482
* Revert r296474 - [globalisel] Change LLT constructor string into an LLT ↵Daniel Sanders2017-02-287-77/+61
| | | | | | | | subclass that knows how to generate it. There's a circular dependency that's only revealed when LLVM_ENABLE_MODULES=1. llvm-svn: 296478
* In visitSTORE, always use FindBetterChain, rather than only when UseAA is ↵Nirav Dave2017-02-284-371/+380
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | enabled. Recommiting after fixup of 32-bit aliasing sign offset bug in DAGCombiner. * Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search and chain alias analysis which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and the output Codegen (save perhaps for some ARM cases where we correctly constructs wider loads, but then promotes them to float operations which appear but requires more expensive constant generation). Some minor peephole optimizations to deal with improved SubDAG shapes (listed below) Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seems sufficient to not cause regressions in tests. 5. Remove Chain dependencies of Memory operations on CopyfromReg nodes as these are captured by data dependence 6. Forward loads-store values through tokenfactors containing {CopyToReg,CopyFromReg} Values. 7. Peephole to convert buildvector of extract_vector_elt to extract_subvector if possible (see CodeGen/AArch64/store-merge.ll) 8. Store merging for the ARM target is restricted to 32-bit as some in some contexts invalid 64-bit operations are being generated. This can be removed once appropriate checks are added. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable, improving load-store forwarding. One test in particular is worth noting: CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store forwarding converts a load-store pair into a parallel store and a memory-realized bitcast of the same value. However, because we lose the sharing of the explicit and implicit store values we must create another local store. A similar transformation happens before SelectionDAG as well. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle llvm-svn: 296476
* [globalisel] Change LLT constructor string into an LLT subclass that knows ↵Daniel Sanders2017-02-287-61/+77
| | | | | | | | | | | | | | | | | | how to generate it. Summary: This will allow future patches to inspect the details of the LLT. The implementation is now split between the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns. Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem. Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30046 llvm-svn: 296474
* [ARM] GlobalISel: Lower i32 and fp call parameters on the stackDiana Picus2017-02-281-7/+31
| | | | | | | | | | | | Lower i32, float and double parameters that need to live on the stack. This boils down to creating some G_GEPs starting from the stack pointer and storing the values there. During the process we also keep track of the stack size and use the final value in the ADJCALLSTACKDOWN/UP instructions. We currently assert for smaller types, since they usually require extensions. They will be handled in a separate patch. llvm-svn: 296473
* [ARM] GlobalISel: Select 32-bit G_CONSTANTDiana Picus2017-02-281-0/+11
| | | | | | Put it into a register by means of a MOVi. llvm-svn: 296471
* [ARM] GlobalISel: Add mapping for G_CONSTANTDiana Picus2017-02-281-0/+1
| | | | | | | Like G_FRAME_INDEX, G_CONSTANT has one register operand and one non-register operand. llvm-svn: 296469
* [ARM] GlobalISel: Legalize 32-bit constantsDiana Picus2017-02-281-0/+2
| | | | llvm-svn: 296468
* Reformat a blank line.NAKAMURA Takumi2017-02-281-1/+1
| | | | llvm-svn: 296464
* Revert r296442 (and r296443), "Allow externally dlopen-ed libraries to be ↵NAKAMURA Takumi2017-02-282-51/+14
| | | | | | | | registered as permanent libraries." It broke clang/test/Analysis/checker-plugins.c llvm-svn: 296463
* [ARM] GlobalISel: Select G_GEPDiana Picus2017-02-281-0/+1
| | | | | | At this point, G_GEP is just an add, so we treat it exactly like a G_ADD. llvm-svn: 296462
* [ARM] Diagnose PC-writing instructions in IT blocksOliver Stannard2017-02-281-4/+7
| | | | | | | | | | | | In Thumb2, instructions which write to the PC are UNPREDICTABLE if they are in an IT block but not the last instruction in the block. Previously, we only diagnosed this for LDM instructions, this patch extends the diagnostic to cover all of the relevant instructions. Differential Revision: https://reviews.llvm.org/D30398 llvm-svn: 296459
* [ARM] GlobalISel: Add reg bank mapping for G_GEPDiana Picus2017-02-281-0/+1
| | | | | | This should be the same as the mapping for G_ADD etc. llvm-svn: 296455
* [ARM] GlobalISel: Legalize G_GEP with 32-bit offsetsDiana Picus2017-02-281-0/+3
| | | | | | | At the moment we're only interested in GEPs for putting call parameters on the stack, so we'll stick to 32-bit offsets. llvm-svn: 296452
* Test commit, fix typo, NFC.Vadzim Dambrouski2017-02-281-1/+1
| | | | llvm-svn: 296447
* Fix Win bots.Vassil Vassilev2017-02-281-2/+2
| | | | llvm-svn: 296443
* Allow externally dlopen-ed libraries to be registered as permanent libraries.Vassil Vassilev2017-02-282-13/+50
| | | | | | | | | | This is also useful in cases when llvm is in a shared library. First we dlopen the llvm shared library and then we register it as a permanent library in order to keep the JIT and other services working. Patch reviewed by Vedant Kumar (D29955)! llvm-svn: 296442
* [ImplicitNullCheck] Add alias analysis usageSanjoy Das2017-02-281-27/+75
| | | | | | | | | | | | | | | | | | | Summary: With this change ImplicitNullCheck optimization uses alias analysis and can use load/store memory access for implicit null check if there are other load/store before but memory accesses do not alias. Patch by Serguei Katkov! Reviewers: sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30331 llvm-svn: 296440
* Empty line. NFCIXin Tong2017-02-281-1/+0
| | | | llvm-svn: 296438
* [LoopUnswitch] Common pushing LIC's user to worklist.Xin Tong2017-02-281-4/+2
| | | | llvm-svn: 296432
* Revert "Add MIR-level outlining pass"Matthias Braun2017-02-286-1508/+0
| | | | | | | | Revert Machine Outliner for now, as it breaks the asan bot. This reverts commit r296418. llvm-svn: 296426
* Add MIR-level outlining passMatthias Braun2017-02-286-0/+1508
| | | | | | | | | | | | | | | | | | | | | | | | | | This is a patch for the outliner described in the RFC at: http://lists.llvm.org/pipermail/llvm-dev/2016-August/104170.html The outliner is a code-size reduction pass which works by finding repeated sequences of instructions in a program, and replacing them with calls to functions. This is useful to people working in low-memory environments, where sacrificing performance for space is acceptable. This adds an interprocedural outliner directly before printing assembly. For reference on how this would work, this patch also includes X86 target hooks and an X86 test. The outliner is run like so: clang -mno-red-zone -mllvm -enable-machine-outliner file.c Patch by Jessica Paquette<jpaquette@apple.com>! rdar://29166825 Differential Revision: https://reviews.llvm.org/D26872 llvm-svn: 296418
* [CGP] Split some critical edges coming out of indirect branchesMichael Kuperstein2017-02-281-0/+157
| | | | | | | | | | | | | | | | | | | | | | Splitting critical edges when one of the source edges is an indirectbr is hard in general (because it requires changing the memory the indirectbr reads). But if a block only has a single indirectbr predecessor (which is the common case), we can simulate splitting that edge by splitting the destination block, and retargeting the *direct* branches. This is motivated by the use of computed gotos in python 2.7: PyEval_EvalFrame() ends up using an indirect branch with ~100 successors, and passing a constant to each of those. Since MachineSink can't break indirect critical edges on demand (and doing this in MIR doesn't look feasible), this causes us to emit about ~100 defs of registers containing constants, which we in the predecessor block, where only one of those constants is used in each successor. So, at each computed goto, we needlessly spill about a 100 constants to stack. The end result is that a clang-compiled python interpreter can be about ~2.5x slower on a simple python reduction loop than a gcc-compiled interpreter. Differential Revision: https://reviews.llvm.org/D29916 llvm-svn: 296416
* [PDB] Make streams carry their own endianness.Zachary Turner2017-02-2818-92/+80
| | | | | | | | | | | | | Before the endianness was specified on each call to read or write of the StreamReader / StreamWriter, but in practice it's extremely rare for streams to have data encoded in multiple different endiannesses, so we should optimize for the 99% use case. This makes the code cleaner and more general, but otherwise has NFC. llvm-svn: 296415
* [DebugInfo] Fix some Clang-tidy modernize and Include What You Use warnings; ↵Eugene Zelenko2017-02-279-46/+83
| | | | | | other minor fixes (NFC). llvm-svn: 296413
* [SLP] Load sorting should not try to sort things that aren't loads.Michael Kuperstein2017-02-271-0/+5
| | | | | | | We may get a VL where the first element is a load, but the others aren't. Trying to sort such VLs can only lead to sorrow. llvm-svn: 296411
OpenPOWER on IntegriCloud