summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* Whoops, remove trailing whitespace.Alex Rosenberg2015-08-271-1/+1
| | | | llvm-svn: 246141
* isKnownNonNull needs to consider globals in non-zero address spaces.Pete Cooper2015-08-271-2/+5
| | | | | | | | | Globals in address spaces other than one may have 0 as a valid address, so we should not assume that they can be null. Reviewed by Philip Reames. llvm-svn: 246137
* Allow value forwarding past release fences in EarlyCSEPhilip Reames2015-08-271-0/+11
| | | | | | | | | | | | A release fence acts as a publication barrier for stores within the current thread to become visible to other threads which might observe the release fence. It does not require the current thread to observe stores performed on other threads. As a result, we can allow store-load and load-store forwarding across a release fence. We do need to make sure that stores before the fence can't be eliminated even if there's another store to the same location after the fence. In theory, we could reorder the second store above the fence and *then* eliminate the former, but we can't do this if the stores are on opposite sides of the fence. Note: While more aggressive then what's there, this patch is still implementing a really conservative ordering. In particular, I'm not trying to exploit undefined behavior via races, or the fact that the LangRef says only 'atomic' accesses are ordered w.r.t. fences. Differential Revision: http://reviews.llvm.org/D11434 llvm-svn: 246134
* [RewriteStatepointsForGC] Reduce the number of new instructions for base ↵Philip Reames2015-08-271-3/+57
| | | | | | | | | | | | | | pointers When computing base pointers, we introduce new instructions to propagate the base of existing instructions which might not be bases. However, the algorithm doesn't make any effort to recognize when the new instruction to be inserted is the same as an existing one already in the IR. Since this is happening immediately before rewriting, we don't really have a chance to fix it after the pass runs without teaching loop passes about statepoints. I'm really not thrilled with this patch. I've rewritten it 4 different ways now, but this is the best I've come up with. The case where the new instruction is just the original base defining value could be merged into the existing algorithm with some complexity. The problem is that we might have something like an extractelement from a phi of two vectors. It may be trivially obvious that the base of the 0th element is an existing instruction, but I can't see how to make the algorithm itself figure that out. Thus, I resort to the call to SimplifyInstruction instead. Note that we can only adjust the instructions we've inserted ourselves. The live sets are still being tracked in side structures at this point in the code. We can't easily muck with instructions which might be in them. Long term, I'm really thinking we need to materialize the live pointer sets explicitly in the IR somehow rather than using side structures to track them. Differential Revision: http://reviews.llvm.org/D12004 llvm-svn: 246133
* Improved printing of analysis diagnostics in the loop vectorizer.Tyler Nowicki2015-08-271-18/+26
| | | | | | | | This patch ensures that every analysis diagnostic produced by the vectorizer will be printed if the loop has a vectorization hint on it. The condition has also been improved to prevent printing when a disabling hint is specified. llvm-svn: 246132
* Fixed a bug that edge weights are not assigned correctly when lowering ↵Cong Hou2015-08-271-1/+1
| | | | | | | | | | switch statement. This is a one-line-change patch that moves the update to UnhandledWeights to the correct position: it should be updated for all clusters instead of just range clusters. Differential Revision: http://reviews.llvm.org/D12391 llvm-svn: 246129
* [SimplifyCFG] Prune code from a provably unreachable switch defaultPhilip Reames2015-08-261-0/+17
| | | | | | | | | | As Sanjoy pointed out over in http://reviews.llvm.org/D11819, a switch on an icmp should always be able to become a branch instruction. This patch generalizes that notion slightly to prove that the default case of a switch is unreachable if the cases completely cover all possible bit patterns in the condition. Once that's done, the switch to branch conversion kicks in just fine. Note: Duplicate case values are disallowed by the LangRef and verifier. Differential Revision: http://reviews.llvm.org/D11995 llvm-svn: 246125
* [PowerPC] Remove unnecessary braces in PPCVSXFMAMutateHal Finkel2015-08-261-3/+3
| | | | | | Address Eric's post-commit review of r245741. NFC. llvm-svn: 246121
* [NVPTX] Let NVPTX backend detect integer min and max patterns.Bjarke Hammersholt Roune2015-08-261-0/+64
| | | | | | | | | | | | | | Summary: Let NVPTX backend detect integer min and max patterns during isel and emit intrinsics that enable hardware support. Reviewers: jholewinski, meheff, jingyue Subscribers: arsenm, llvm-commits, meheff, jingyue, eliben, jholewinski Differential Revision: http://reviews.llvm.org/D12377 llvm-svn: 246107
* [ARM] Use BranchProbability::scale() to scale an integer with a probability ↵Cong Hou2015-08-261-9/+3
| | | | | | | | | | in ARMBaseInstrInfo.cpp, Previously in isProfitableToIfCvt() in ARMBaseInstrInfo.cpp, the multiplication between an integer and a branch probability is done manually in an unsafe way that may lead to overflow. This patch corrects those cases by using BranchProbability's member function scale() to avoid overflow (which stores the intermediate result in int64). Differential Revision: http://reviews.llvm.org/D12295 llvm-svn: 246106
* Assign weights to edges to jump table / bit test header when lowering switch ↵Cong Hou2015-08-262-11/+22
| | | | | | | | | | statement. Currently, when lowering switch statement and a new basic block is built for jump table / bit test header, the edge to this new block is not assigned with a correct weight. This patch collects the edge weight from all its successors and assign this sum of weights to the edge (and also the other fall-through edge). Test cases are adjusted accordingly. Differential Revision: http://reviews.llvm.org/D12166#fae6eca7 llvm-svn: 246104
* WebAssembly: NFC comment updateJF Bastien2015-08-261-1/+1
| | | | llvm-svn: 246101
* DI: Make Subprogram definitions 'distinct'Duncan P. N. Exon Smith2015-08-261-11/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change `DIBuilder` always to produce 'distinct' nodes when creating `DISubprogram` definitions. I measured a ~5% memory improvement in the link step (of ld64) when using `-flto -g`. `DISubprogram`s are used in two ways in the debug info graph. Some are definitions, point at actual functions, and can't really be shared between compile units. With full debug info, these point down at their variables, forming uniquing cycles. These uniquing cycles are expensive to link between modules, since all unique nodes that reference them transitively need to be duplicated (see commit message for r244181 for more details). Others are declarations, primarily used for member functions in the type hierarchy. Definitions never show up there; instead, a definition points at its corresponding declaration node. I started by making all subprograms 'distinct'. However, that was too big a hammer: memory usage *increased* ~5% (net increase vs. this patch of ~10%) because the 'distinct' declarations undermine LTO type uniquing. This is a targeted fix for the definitions (where uniquing is an observable problem). A couple of notes: - There's an accompanying commit to update IRGen testcases in clang. - ^ That's what I'm using to test this commit. - In a follow-up, I'll change the verifier to require 'distinct' on definitions and add an upgrade to `BitcodeReader`. llvm-svn: 246098
* WebAssembly: handle private/internal globals.JF Bastien2015-08-261-22/+96
| | | | | | | | | | | | Things of note: - Other linkage types aren't handled yet. We'll figure it out with dynamic linking. - Special LLVM globals are either ignored, or error out for now. - TLS isn't supported yet (WebAssembly will have threads later). - There currently isn't a syntax for alignment, I left it in a comment so it's easy to hook up. - Undef is convereted to whatever the type's appropriate null value is. - assert versus report_fatal_error: follow what other AsmPrinters do, and assert only on what should have been caught elsewhere. llvm-svn: 246092
* [ms-inline-asm] Relax assertion around funky identifiers slightlyReid Kleckner2015-08-261-6/+8
| | | | | | | | | A corresponding clang change will make it so that clang can consume part of an assembler token. The assembler treats '.' as an identifier character while clang does not, so it's view of the token stream is a little different. llvm-svn: 246089
* [libFuzzer] fix minor inefficiency, PR24584Kostya Serebryany2015-08-261-1/+1
| | | | llvm-svn: 246087
* Fix LLVM C API for DataLayoutMehdi Amini2015-08-261-20/+15
| | | | | | | | | | | | | | | | | | | | | | | | | We removed access to the DataLayout on the TargetMachine and deprecated the C API function LLVMGetTargetMachineData() in r243114. However the way I tried to be backward compatible was broken: I changed the wrapper of the TargetMachine to be a structure that includes the DataLayout as well. However the TargetMachine is also wrapped by the ExecutionEngine, in the more classic way. A client using the TargetMachine wrapped by the ExecutionEngine and trying to get the DataLayout would break. It seems tricky to solve the problem completely in the C API implementation. This patch tries to address this backward compatibility in a more lighter way in the C++ API. The C API is restored in its original state and the removed C++ API is reintroduced, but privately. The C API is friended to the TargetMachine and should be the only consumer for this API. Reviewers: ributzka Differential Revision: http://reviews.llvm.org/D12263 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 246082
* AMDGPU: Delete dead codeMatt Arsenault2015-08-263-68/+4
| | | | | | | | | | | | | | | | | There is no context where s_mov_b64 is emitted and could potentially be moved to the VALU. It is currently only emitted for materializing immediates, which can't be dependent on vector sources. The immediate splitting is already done when selecting constants. I'm not sure what contexts if any the register splitting would have been used before. Also clean up using s_mov_b64 in place of v_mov_b64_pseudo, although this isn't required and just skips the extra step of eliminating the copy from the SReg_64. llvm-svn: 246080
* AMDGPU: Don't reprocess instructions when splitting i64 bcntMatt Arsenault2015-08-261-4/+5
| | | | llvm-svn: 246079
* AMDGPU: Fix not moving users of s_bfe_i64 to VALUMatt Arsenault2015-08-261-0/+2
| | | | | | | This wouldn't propagate to users of the original BFE and would hit a verifier error. llvm-svn: 246078
* AMDGPU: Don't create intermediate SALU instructionsMatt Arsenault2015-08-262-27/+44
| | | | | | | | | | | | When splitting 64-bit operations, create the correct VALU instructions immediately. This was splitting things like s_or_b64 into the two s_or_b32s and then pushing the new instructions onto the worklist. There's no reason we need to do this intermediate step. llvm-svn: 246077
* SelectionDAGBuilder: Fix SPDescriptor not resetting GuardRegMatthias Braun2015-08-261-0/+1
| | | | | | | | This was causing problems when some functions use a GuardReg and some don't as can happen when mixing SelectionDAG and FastISel generated functions. llvm-svn: 246075
* FastISel: Avoid adding a successor block twice for degenerate IR.Matthias Braun2015-08-261-1/+5
| | | | | | | | This fixes http://llvm.org/PR24581 Differential Revision: http://reviews.llvm.org/D12350 llvm-svn: 246074
* Expose hasLiveCondCodeDef as a member function of the X86InstrInfo class. NFCAndrew Kaylor2015-08-262-1/+5
| | | | | | | | | This takes the existing static function hasLiveCondCodeDef and makes it a member function of the X86InstrInfo class. This is a useful utility function that an upcoming change would like to use. NFC. Patch by: Kevin B. Smith Differential Revision: http://reviews.llvm.org/D12371 llvm-svn: 246073
* Fix memory leak in sample profile pass.Diego Novillo2015-08-261-28/+23
| | | | | | | | | | | | | | | | The problem here were the function analyses invoked by the function pass manager from the new IPO pass. I looked at other IPO passes needing dominance information and the only one that requires it (partial inliner) does not use the standard dependency mechanism. This patch mimics what the partial inliner does to compute dominance, post-dominance and loop info. One thing I like about this approach is that I can delay the computation of all this until I actually need it. This should bring the ASAN buildbot back to green. If there's a better way to fix this, I'll do it in a follow-up patch. llvm-svn: 246066
* Revert "Fix LLVM C API for DataLayout"Mehdi Amini2015-08-261-8/+22
| | | | | | | | This reverts commit r246052. Third attempt, still unpleasant for some bots. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 246057
* AMDGPU/SI: Report SIFixSGPRLiveRanges changed functionMatt Arsenault2015-08-261-1/+4
| | | | llvm-svn: 246056
* Fix LLVM C API for DataLayoutMehdi Amini2015-08-261-22/+8
| | | | | | | | | | | | | | | | | | | | | | | | | We removed access to the DataLayout on the TargetMachine and deprecated the C API function LLVMGetTargetMachineData() in r243114. However the way I tried to be backward compatible was broken: I changed the wrapper of the TargetMachine to be a structure that includes the DataLayout as well. However the TargetMachine is also wrapped by the ExecutionEngine, in the more classic way. A client using the TargetMachine wrapped by the ExecutionEngine and trying to get the DataLayout would break. It seems tricky to solve the problem completely in the C API implementation. This patch tries to address this backward compatibility in a more lighter way in the C++ API. The C API is restored in its original state and the removed C++ API is reintroduced, but privately. The C API is friended to the TargetMachine and should be the only consumer for this API. Reviewers: ributzka Differential Revision: http://reviews.llvm.org/D12263 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 246052
* AMDGPU: Make sure to reserve super registersMatt Arsenault2015-08-262-16/+18
| | | | | | | | I think this could potentially have broken if one of the super registers were allocated that contain v254/v255. llvm-svn: 246051
* Revert "Fix LLVM C API for DataLayout"Mehdi Amini2015-08-261-8/+22
| | | | | | | | This reverts commit r246044. Build broken, still. It builds for me... From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 246049
* AMDGPU: Produce error on dynamic_stackallocMatt Arsenault2015-08-263-0/+19
| | | | llvm-svn: 246048
* [SimplifyLibCalls] Fix a typoDavid Majnemer2015-08-261-1/+1
| | | | | | | cbrt(sqrt(x)) calculates the sixth root, not the ninth root. cbrt(cbrt(x)) calculates the ninth root. llvm-svn: 246046
* Fix LLVM C API for DataLayoutMehdi Amini2015-08-261-22/+8
| | | | | | | | | | | | | | | | | | | | | | | | | We removed access to the DataLayout on the TargetMachine and deprecated the C API function LLVMGetTargetMachineData() in r243114. However the way I tried to be backward compatible was broken: I changed the wrapper of the TargetMachine to be a structure that includes the DataLayout as well. However the TargetMachine is also wrapped by the ExecutionEngine, in the more classic way. A client using the TargetMachine wrapped by the ExecutionEngine and trying to get the DataLayout would break. It seems tricky to solve the problem completely in the C API implementation. This patch tries to address this backward compatibility in a more lighter way in the C++ API. The C API is restored in its original state and the removed C++ API is reintroduced, but privately. The C API is friended to the TargetMachine and should be the only consumer for this API. Reviewers: ributzka Differential Revision: http://reviews.llvm.org/D12263 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 246044
* [SPARC] Fix stupid oversight in stack realignment support.James Y Knight2015-08-263-2/+37
| | | | | | | | | | | | | | | | | | | | If you're going to realign %sp to get object alignment properly (which the code does), and stack offsets and alignments are calculated going down from %fp (which they are), then the total stack size had better be a multiple of the alignment. LLVM did indeed ensure that. And then, after aligning, the sparc frame code added 96 (for sparcv8) to the frame size, making any requested alignment of 64-bytes or higher *guaranteed* to be misaligned. The test case added with r245668 even tests this exact scenario, and asserted the incorrect behavior, which I somehow failed to notice. D'oh. This change fixes the frame lowering code to align the stack size *after* adding the spill area, instead. Differential Revision: http://reviews.llvm.org/D12349 llvm-svn: 246042
* [llvm-mc] Ignore opcode size prefix in 64-bit CALL disassemblyVedant Kumar2015-08-261-0/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a fix for disassembling unusual instruction sequences in 64-bit mode w.r.t the CALL rel16 instruction. It might be desirable to move the check somewhere else, but it essentially mimics the special case handling with JCXZ in 16-bit mode. The current behavior accepts the opcode size prefix and causes the call's immediate to stop disassembling after 2 bytes. When debugging sequences of instructions with this pattern, the disassembler output becomes extremely unreliable and essentially useless (if you jump midway into what lldb thinks is a unified instruction, you'll lose %rip). So we ignore the prefix and consume all 4 bytes when disassembling a 64-bit mode binary. Note: in Vol. 2A 3-99 the Intel spec states that CALL rel16 is N.S. N.S. is defined as: Indicates an instruction syntax that requires an address override prefix in 64-bit mode and is not supported. Using an address override prefix in 64-bit mode may result in model-specific execution behavior. (Vol. 2A 3-7) Since 0x66 is an operand override prefix we should be OK (although we may want to warn about 0x67 prefixes to 0xe8). On the CPUs I tested with, they all ignore the 0x66 prefix in 64-bit mode. Patch by Matthew Barney! Differential Revision: http://reviews.llvm.org/D9573 llvm-svn: 246038
* [AArch64] Remove a use-after-free when collecting stats.Chad Rosier2015-08-261-4/+4
| | | | | | | The call to mergePairedInsns() deletes MI, so the later use by isUnscaledLdSt() is referencing freed memory. llvm-svn: 246033
* [AArch64] Unify the integer min/max vector selection patterns with the ↵Silviu Baranga2015-08-262-52/+16
| | | | | | | | | | | | | | | | | | | | intrinsic ones Summary: This change lowers the aarch64 integer vector min/max intrinsic nodes to generic min/max nodes and replaces the intrinsic selection patterns with the generic ones. There should already be testing in place for this, so no further tests were added. Reviewers: jmolloy Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D12276 llvm-svn: 246030
* [SROA] Rip out all support for SSAUpdater in SROA.Chandler Carruth2015-08-261-198/+8
| | | | | | | | | | | | | This was only added to preserve the old ScalarRepl's use of SSAUpdater which was originally to avoid use of dominance frontiers. Now, we only need a domtree, and we'll need a domtree right after this pass as well and so it makes perfect sense to always and only use the dom-tree powered mem2reg. This was flag-flipper earlier and has stuck reasonably so I wanted to gut the now-dead code out of SROA before we waste more time with it. Among other things, this will make passmanager porting easier. llvm-svn: 246028
* Modernize with range-based for loops.Alex Rosenberg2015-08-261-21/+15
| | | | llvm-svn: 246018
* Reduce code duplication.Alex Rosenberg2015-08-261-12/+23
| | | | llvm-svn: 246017
* Trailing whitespaceAlex Rosenberg2015-08-261-1/+1
| | | | llvm-svn: 246016
* [MC] Split the layout part of MCAssembler::finish() into its own method. NFC.Frederic Riss2015-08-261-6/+9
| | | | | | | | | | Split a MCAssembler::layout() method out of MCAssembler::finish(). This allows running the MCSections layout separately from the streaming of the output file. This way if a client wants to use MC to generate section contents, but emit something different than the standard relocatable object files it is possible (llvm-dsymutil is such a client). llvm-svn: 246008
* [MC/MachO] Make some MachObjectWriter methods more generic. NFC.Frederic Riss2015-08-261-25/+30
| | | | | | | Hardcode less values in some mach-o header writing routines and pass them as argument. Doing so will allow reusing this code in llvm-dsymutil. llvm-svn: 246007
* Comparing operands should not require the same ValueIDJF Bastien2015-08-261-5/+2
| | | | | | | | | | | | | Summary: When comparing basic blocks, there is an additional check that two Value*'s should have the same ID, which interferes with merging equivalent constants of different kinds (such as a ConstantInt and a ConstantPointerNull in the included testcase). The cmpValues function already ensures that the two values in each function are the same, so removing this check should not cause incorrect merging. Also, the type comparison is redundant, based on reviewing the code and testing on the test suite and several large LTO bitcodes. Author: jrkoenig Reviewers: nlewycky, jfb, dschuff Subscribers: llvm-commits Differential revision: http://reviews.llvm.org/D12302 llvm-svn: 246001
* Expose more properties of llvm::fltSemanticsJF Bastien2015-08-261-0/+15
| | | | | | | | | | | Summary: Adds accessor functions for all the fields in llvm::fltSemantics. This will be used in MergeFunctions to order two APFloats with different semanatics. Author: jrkoenig Reviewers: jfb Subscribers: dschuff, llvm-commits Differential revision: http://reviews.llvm.org/D12253 llvm-svn: 245999
* FastISel: Use finishCondBranch() for ARM,Mips,PowerPC FastISelMatthias Braun2015-08-263-10/+5
| | | | | | Note that after this change branch probabilities are preserved now. llvm-svn: 245998
* FastISel: Factor out common code; NFC intendedMatthias Braun2015-08-263-69/+22
| | | | | | | | | This should be no functional change but for the record: For three cases in X86FastISel this will change the order in which the FalseMBB and TrueMBB of a conditional branch is addedd to the successor/predecessor lists. llvm-svn: 245997
* WebAssembly: add small FIXME for AsmPrinter.JF Bastien2015-08-261-0/+1
| | | | | | Suggested by @sunfish as a follow-up to r245982. llvm-svn: 245996
* Make variable argument intrinsics behave correctly in a Win64 CC function.Charles Davis2015-08-254-58/+77
| | | | | | | | | | | | | | | | Summary: This change makes the variable argument intrinsics, `llvm.va_start` and `llvm.va_copy`, and the `va_arg` instruction behave as they do on Windows inside a `CallingConv::X86_64_Win64` function. It's needed for a Clang patch I have to add support for GCC's `__builtin_ms_va_list` constructs. Reviewers: nadav, asl, eugenis CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D1622 llvm-svn: 245990
* WebAssembly: assert that there aren't any constant poolsJF Bastien2015-08-251-0/+7
| | | | | | WebAssembly will either use globals or immediates, since it's a virtual ISA. llvm-svn: 245989
OpenPOWER on IntegriCloud