path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* [SimplifyLibCalls] sprintf doesn't copy null bytesDavid Majnemer2016-04-261-3/+4
| | | | | | | | | | sprintf doesn't read or copy the terminating null byte from its string operands. sprintf will append its own after processing all of the format specifiers. This fixes PR27526. llvm-svn: 267580
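The behaviour described above can be observed from plain C++. This is a minimal sketch, not the SimplifyLibCalls code: the helper name `formatString` and the `'X'` sentinel are illustrative assumptions. It shows that the count returned by `sprintf` excludes the terminator, and that exactly one null byte is appended after the formatted output.

```cpp
#include <cstdio>
#include <cstring>

// Hypothetical helper (not part of LLVM): pre-fills `dst` with a sentinel,
// formats `s` through "%s", and returns what sprintf reports as written.
// The caller must pass a buffer large enough for `s` plus one terminator.
int formatString(const char *s, char *dst, std::size_t dstSize) {
  std::memset(dst, 'X', dstSize);  // sentinel marks bytes sprintf never touches
  return std::sprintf(dst, "%s", s);
}
```

For the input `"abc"` the call reports three characters written, byte 3 holds the terminator that `sprintf` appended itself, and byte 4 still holds the sentinel.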
* Swift Calling Convention: use %RAX for sret.Manman Ren2016-04-263-1/+16
| | | | | | | We don't need to copy the sret argument into %rax upon return. rdar://25671494 llvm-svn: 267579
* [AMDGPU] Move reserved vgpr count for trap handler usage to ↵Konstantin Zhuravlyov2016-04-266-9/+20
| | | | | | | | SIMachineFunctionInfo + minor commenting changes Differential Revision: http://reviews.llvm.org/D19537 llvm-svn: 267573
* [CodeGenPrepare] use branch weight metadata to decide if a select should be ↵Sanjay Patel2016-04-264-33/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | turned into a branch This is part of solving PR27344: https://llvm.org/bugs/show_bug.cgi?id=27344 CGP should undo the SimplifyCFG transform for the same reason that earlier patches have used this same mechanism: it's possible that passes between SimplifyCFG and CGP may be able to optimize the IR further with a select in place. For the TLI hook default, >99% taken or not taken is chosen as the default threshold for a highly predictable branch. Even the most limited HW branch predictors will be correct on this branch almost all the time, so even a massive mispredict penalty perf loss would be overcome by the win from all the times the branch was predicted correctly. As a follow-up, we could make the default target hook less conservative by using the SchedMachineModel's MispredictPenalty. Or we could just let targets override the default by implementing the hook with that and other target-specific options. Note that trying to statically determine mispredict rates for close-to-balanced profile weight data is generally impossible if the HW is sufficiently advanced. I.e., 50/50 taken/not-taken might still be 100% predictable. Finally, note that this patch as-is will not solve PR27344 because the current __builtin_unpredictable() branch weight default values are 4 and 64. A proposal to change that is in D19435. Differential Revision: http://reviews.llvm.org/D19488 llvm-svn: 267572
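The >99% threshold can be illustrated with a small sketch. The function name `isHighlyPredictable` is hypothetical and this is not the actual TLI hook; it only shows the arithmetic on raw branch weights, under the assumption that the larger weight divided by the sum must exceed 99%.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper, not the real target hook: a branch counts as highly
// predictable when one side is taken more than 99% of the time.
bool isHighlyPredictable(std::uint64_t trueWeight, std::uint64_t falseWeight) {
  const std::uint64_t sum = trueWeight + falseWeight;
  if (sum == 0)
    return false;  // no profile data, so nothing can be concluded
  const std::uint64_t larger = std::max(trueWeight, falseWeight);
  // larger / sum > 99 / 100, kept in integer arithmetic.
  // Overflow for very large weights is ignored in this sketch.
  return larger * 100 > sum * 99;
}
```

With the weights 4 and 64 quoted above, the larger side is 64 of 68, roughly 94%, which falls below the threshold. That is consistent with the note that the patch alone does not resolve PR27344.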
* Fix build broken due to order of initialization problem.Zachary Turner2016-04-261-2/+1
| | | | llvm-svn: 267571
* Refactor some more PDB reading code into DebugInfoPDB.Zachary Turner2016-04-263-0/+164
| | | | | | | Differential Revision: http://reviews.llvm.org/D19445 Reviewed By: David Majnemer llvm-svn: 267564
* [AMDGPU] Reserve VGPRs for trap handler usage if instructedKonstantin Zhuravlyov2016-04-266-1/+48
| | | | | | Differential Revision: http://reviews.llvm.org/D19235 llvm-svn: 267563
* Use gcc's rules for parsing gcc-style response filesNico Weber2016-04-261-7/+4
| | | | | | | | | | | | | | | | | | | | | | | | | In gcc, \ escapes every character in response files. It is true that this makes it harder to mention Windows files in rsp files, but not doing this means clang disagrees with gcc, and also disagrees with the shell (on non-Windows) which rsp file quoting is supposed to match. clang isn't free to choose what to do here. In general, the idea for response files is to take bits of your command line and write them to a file unchanged, and have things work the same way. Since the command line would've been interpreted by the shell, things in the rsp file need to be subject to the same shell quoting rules. People who want to put Windows-style paths in their response files need to do one of the following: * escape their backslashes * or use clang-cl which uses cl.exe/cmd.exe quoting rules * pass --rsp-quoting=windows to clang to tell it to use cl.exe/cmd.exe quoting rules for response files. Fixes PR27464. http://reviews.llvm.org/D19417 llvm-svn: 267556
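The effect of the rule "\ escapes every character" on Windows-style paths can be seen in a simplified tokenizer. This is a sketch under stated assumptions, not LLVM's tokenizer: the name `tokenizeGnuStyle` is hypothetical, quote handling is reduced to the basics, and keeping a lone trailing backslash literally is my own choice.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Simplified, hypothetical sketch of gcc-style response-file tokenizing.
std::vector<std::string> tokenizeGnuStyle(const std::string &text) {
  std::vector<std::string> args;
  std::string current;
  bool inToken = false;
  char quote = '\0';  // active quote character, or '\0' outside quotes
  for (std::size_t i = 0; i < text.size(); ++i) {
    const char c = text[i];
    if (c == '\\' && i + 1 < text.size()) {
      current += text[++i];  // backslash escapes whatever character follows
      inToken = true;
    } else if (quote != '\0') {
      if (c == quote)
        quote = '\0';  // closing quote ends the quoted run
      else
        current += c;
    } else if (c == '\'' || c == '"') {
      quote = c;  // opening quote starts (or continues) a token
      inToken = true;
    } else if (std::isspace(static_cast<unsigned char>(c))) {
      if (inToken) {  // unquoted whitespace ends the current token
        args.push_back(current);
        current.clear();
        inToken = false;
      }
    } else {
      current += c;  // ordinary character (a lone trailing '\' lands here)
      inToken = true;
    }
  }
  if (inToken)
    args.push_back(current);
  return args;
}
```

Under these rules an unescaped path such as `C:\foo\bar` loses its backslashes, while `C:\\foo` survives as `C:\foo`, which is why the message tells users to escape their backslashes or switch quoting modes.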
* [AMDGPU] Assembler: basic support for SDWA instructionsSam Kolton2016-04-268-58/+414
| | | | | | | | | | | | | Support for SDWA instructions for VOP1 and VOP2 encoding. Not done yet: - converters to support optional operands and modifiers - VOPC - sext() modifier - intrinsics - VOP2b (see vop_dpp.s) - V_MAC_F32 (see vop_dpp.s) Differential Revision: http://reviews.llvm.org/D19360 llvm-svn: 267553
* [X86] PR27502: Fix the LEA optimization pass.Andrey Turetskiy2016-04-261-2/+6
| | | | | | | | Handle MachineBasicBlock as a memory displacement operand in the LEA optimization pass. Differential Revision: http://reviews.llvm.org/D19409 llvm-svn: 267551
* [Sparc] Fix build error introduced by rL267545.Marcin Koscielnicki2016-04-261-1/+1
| | | | llvm-svn: 267549
* [PowerPC] Add support for llvm.thread.pointerMarcin Koscielnicki2016-04-261-0/+10
| | | | | | Differential Revision: http://reviews.llvm.org/D19304 llvm-svn: 267546
* [SPARC] [SSP] Add support for LOAD_STACK_GUARD.Marcin Koscielnicki2016-04-266-1/+40
| | | | | | | | This fixes PR22248 on sparc. Differential Revision: http://reviews.llvm.org/D19386 llvm-svn: 267545
* [SPARC] Add support for llvm.thread.pointer.Marcin Koscielnicki2016-04-262-0/+18
| | | | | | Differential Revision: http://reviews.llvm.org/D19387 llvm-svn: 267544
* ThinLTOCodeGenerator: preserve linkonce when in "MustPreserved" setMehdi Amini2016-04-261-4/+10
| | | | | | | | | If the linker specifically requested for a linkonce to be preserved, we need to make sure we won't drop it even if all the uses in the current module disappear. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267543
* Revert "ARM: put correct symbol index on indirect pointers in __thread_ptr."Renato Golin2016-04-261-2/+1
| | | | | | This reverts commit r267488, as it broke some ARM buildbots. llvm-svn: 267541
* [ppc64] Reenable sibling call optimization on ppc64 since fixed tsan library ↵Chuang-Yu Cheng2016-04-261-1/+1
| | | | | | | | | | | tail-call issue print-stack-trace.cc test failure of compiler-rt has been fixed by r266869 (http://reviews.llvm.org/D19148), so reenable sibling call optimization on ppc64 Reviewers: nemanjai kbarton llvm-svn: 267527
* [AArch64] Expand v1i64 and v2i64 ctlz.Craig Topper2016-04-261-0/+3
| | | | | | The default is legal, which results in 'Cannot select' errors. llvm-svn: 267522
* [ARM] Expand vector ctlz_zero_undef so it becomes ctlz.Craig Topper2016-04-261-0/+10
| | | | | | The default is Legal, which results in 'Cannot select' errors. llvm-svn: 267521
* [ARM] Expand v1i64 and v2i64 ctlz.Craig Topper2016-04-261-0/+3
| | | | | | The default is legal, which results in 'Cannot select' errors. llvm-svn: 267520
* Tune basic block annotation algorithm.Dehao Chen2016-04-261-9/+12
| | | | | | | | | | | | | | | Summary: Instead of using maximum IR weight as the basic block weight, this patch uses the voting algorithm to find the most likely weight for the basic block. This can effectively avoid the cases when some IRs are annotated incorrectly due to code motion of the profiled binary. This patch also updates propagate.ll unittest to include discriminator in the input file so that it is testing something meaningful. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19301 llvm-svn: 267519
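The difference between taking the maximum and voting can be sketched as follows. The function name `blockWeightByVoting` is hypothetical and this is not the patch's code; in particular, breaking ties in favour of the larger weight is an assumption of this sketch.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical sketch: pick the weight that occurs most often among the
// per-instruction weights of one basic block. Returns 0 for an empty block.
std::uint64_t blockWeightByVoting(const std::vector<std::uint64_t> &instWeights) {
  std::map<std::uint64_t, unsigned> votes;
  for (std::uint64_t w : instWeights)
    ++votes[w];
  std::uint64_t best = 0;
  unsigned bestVotes = 0;
  for (const auto &entry : votes) {
    // std::map iterates in ascending key order, so ">=" lets the larger
    // weight win a tie (an assumption, not taken from the patch).
    if (entry.second >= bestVotes) {
      best = entry.first;
      bestVotes = entry.second;
    }
  }
  return best;
}
```

For weights `{100, 100, 100, 5000}` the maximum would report 5000, whereas voting reports 100, so a single instruction mis-annotated by code motion no longer dominates the block.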
* [SimplifyCFG] Preserve !llvm.mem.parallel_loop_access when mergingHal Finkel2016-04-262-1/+3
| | | | | | | | When SimplifyCFG merges identical instructions from both sides of a diamond, it can preserve !llvm.mem.parallel_loop_access (as it does with most of the other metadata). There's no real data or control dependency change in this case. llvm-svn: 267515
* [LoopVectorize] Don't consider conditional-load dereferenceability for ↵Hal Finkel2016-04-261-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | marked parallel loops I really thought we were doing this already, but we were not. Given this input: void Test(int *res, int *c, int *d, int *p) { for (int i = 0; i < 16; i++) res[i] = (p[i] == 0) ? res[i] : res[i] + d[i]; } we did not vectorize the loop. Even with "assume_safety" the check that we don't if-convert conditionally-executed loads (to protect against data-dependent dereferenceability) was not elided. One subtlety: As implemented, it will still prefer to use a masked-load intrinsic (given target support) over the speculated load. The choice here seems architecture specific; the best option depends on how expensive the masked load is compared to a regular load. Ideally, using the masked load still reduces unnecessary memory traffic, and so should be preferred. If we'd rather do it the other way, flipping the order of the checks is easy. The LangRef is updated to make explicit that llvm.mem.parallel_loop_access also implies that if-conversion is okay. Differential Revision: http://reviews.llvm.org/D19512 llvm-svn: 267514
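To make the if-conversion concrete, here is the loop from the message next to a hand-written speculated-load form. This is an illustrative sketch: `TestIfConverted` is a hypothetical name, and it shows only the scalar shape of the transformation, not what the vectorizer emits. The pragma is guarded so that compilers other than clang simply skip it.

```cpp
// Loop from the commit message, marked safe for vectorization.
void Test(int *res, int *c, int *d, int *p) {
  (void)c;  // unused in the original example as well
#if defined(__clang__)
#pragma clang loop vectorize(assume_safety)
#endif
  for (int i = 0; i < 16; i++)
    res[i] = (p[i] == 0) ? res[i] : res[i] + d[i];
}

// Hypothetical hand-written equivalent after if-conversion: d[i] is loaded
// on every iteration (speculated), and the condition only selects the addend.
void TestIfConverted(int *res, int *c, int *d, int *p) {
  (void)c;
  for (int i = 0; i < 16; i++) {
    const int addend = d[i];  // unconditional load; must be dereferenceable
    res[i] = res[i] + ((p[i] == 0) ? 0 : addend);
  }
}
```

The second form is only valid when every `d[i]` can be read safely, which is exactly the guarantee the parallel-loop annotation is taken to provide.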
* [WebAssembly] Account for implicit operands when computing operand indices.Dan Gohman2016-04-261-1/+1
| | | | llvm-svn: 267511
* [SROA] Don't falsely report that changes have occurredDavid Majnemer2016-04-261-6/+10
| | | | | | | | | We would report that the function changed despite creating no new allocas or performing any promotion. This fixes PR27316. llvm-svn: 267507
* Reverting Thumb2SizeReduction opt bisect change to fix failing buildbots.Andrew Kaylor2016-04-261-2/+1
| | | | llvm-svn: 267506
* [CodeGenPrepare] don't convert an unpredictable select into control flowSanjay Patel2016-04-261-1/+2
| | | | | | | Suggested in the review of D19488: http://reviews.llvm.org/D19488 llvm-svn: 267504
* Remove MinLatency in SchedMachineModel. NFC.Junmo Park2016-04-2611-14/+0
| | | | | | | | | | | Summary: We don't use MinLatency any more since r184032. Reviewers: atrick, hfinkel, mcrosier Differential Revision: http://reviews.llvm.org/D19474 llvm-svn: 267502
* PM: Port GlobalOpt to the new pass managerJustin Bogner2016-04-265-40/+61
| | | | llvm-svn: 267499
* PM: Convert the logic for GlobalOpt into static functions. NFCJustin Bogner2016-04-261-66/+71
| | | | | | | | | | Pass all of the state we need around as arguments, so that these functions are easier to reuse. There is one part of this that is unusual: we pass around a functor to look up a DomTree for a function. This will be a necessary abstraction when we try to use this code in both the legacy and the new pass manager. llvm-svn: 267498
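The "functor to look up a DomTree" idea can be sketched with stand-in types. Everything here is hypothetical: `Function`, `DominatorTree`, and `processFunction` are simplified placeholders rather than the LLVM classes, and `std::function` is used only to keep the example self-contained.

```cpp
#include <functional>
#include <string>

// Stand-ins for the real LLVM types (assumptions for illustration only).
struct Function { std::string name; };
struct DominatorTree { std::string owner; };

// A static helper that receives its state as arguments. The dominator tree
// is obtained through a callback, so the helper does not care which pass
// manager produced it.
bool processFunction(Function &f,
                     const std::function<DominatorTree &(Function &)> &lookupDomTree) {
  DominatorTree &dt = lookupDomTree(f);
  return dt.owner == f.name;  // placeholder for real work using the tree
}
```

Each pass manager would then supply its own lambda that fetches the tree from whatever analysis mechanism it uses, while the shared logic stays unchanged.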
* [X86] Use LivePhysRegs in X86FixupBWInsts.Ahmed Bougacha2016-04-261-13/+19
| | | | | | | | | Kill-flags, which computeRegisterLiveness uses, are not reliable. LivePhysRegs is. Differential Revision: http://reviews.llvm.org/D19472 llvm-svn: 267495
* Add check for "branch_weights" with prof metadataSanjay Patel2016-04-251-3/+7
| | | | | | | While we're here, fix the comment and variable names to make it clear that these are raw weights, not percentages. llvm-svn: 267491
* [Sparc] Fix double-float fabs and fneg on little endian CPUs.James Y Knight2016-04-251-12/+28
| | | | | | | | | | | | | | | | The SparcV8 fneg and fabs instructions interestingly come only in a single-float variant. Since the sign bit is always the topmost bit no matter what size float it is, you simply operate on the high subregister, as if it were a single float. However, the layout of double-floats in the float registers is reversed on little-endian CPUs, so that the high bits are in the second subregister, rather than the first. Thus, this expansion must check the endianness to use the correct subregister. llvm-svn: 267489
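The same idea can be shown on a host machine by treating a double as two 32-bit words. This is a minimal sketch under assumptions: the helper names are hypothetical, a 64-bit IEEE 754 `double` is assumed, and floating-point byte order is assumed to match integer byte order. It is not the Sparc backend expansion.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == 8, "sketch assumes a 64-bit double");

// Hypothetical helper: detects the host byte order at run time.
inline bool hostIsLittleEndian() {
  const std::uint16_t probe = 1;
  unsigned char first;
  std::memcpy(&first, &probe, 1);
  return first == 1;
}

// Index of the 32-bit word holding the sign bit: the second word on
// little-endian hosts, the first on big-endian ones.
inline int highWordIndex() { return hostIsLittleEndian() ? 1 : 0; }

// Negate by flipping the top bit of the high word only.
double fnegViaHighWord(double x) {
  std::uint32_t words[2];
  std::memcpy(words, &x, sizeof(x));
  words[highWordIndex()] ^= 0x80000000u;
  std::memcpy(&x, words, sizeof(x));
  return x;
}

// Absolute value by clearing the top bit of the high word only.
double fabsViaHighWord(double x) {
  std::uint32_t words[2];
  std::memcpy(words, &x, sizeof(x));
  words[highWordIndex()] &= 0x7fffffffu;
  std::memcpy(&x, words, sizeof(x));
  return x;
}
```

Picking the wrong word would flip a mantissa bit instead of the sign, which is the kind of error the endianness check in the expansion guards against.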
* ARM: put correct symbol index on indirect pointers in __thread_ptr.Tim Northover2016-04-251-1/+2
| | | | | | Otherwise the linker has no idea what should be resolved. llvm-svn: 267488
* Fix build warningAndrew Kaylor2016-04-251-1/+1
| | | | llvm-svn: 267487
* Add optimization bisect opt-in calls for AMDGPU passesAndrew Kaylor2016-04-257-1/+19
| | | | | | Differential Revision: http://reviews.llvm.org/D19450 llvm-svn: 267485
* Reformat LLVMConstPointerNull. NFCAmaury Sechet2016-04-251-2/+1
| | | | llvm-svn: 267484
* Optimize store of "bitcast" from vector to aggregate.Arch D. Robison2016-04-251-0/+60
| | | | | | | | | | | This patch is what was the "instcombine" portion of D14185, with an additional test added (see julia_pseudovec in test/Transforms/InstCombine/insert-val-extract-elem.ll). The patch causes instcombine to replace sequences of extractelement-insertvalue-store that act essentially like a bitcast followed by a store. Differential review: http://reviews.llvm.org/D14260 llvm-svn: 267482
* [LVI] Make a precondition explicit rather than handling a case which never ↵Philip Reames2016-04-251-1/+2
| | | | | | happens [NFC] llvm-svn: 267481
* Add optimization bisect opt-in calls for ARM passesAndrew Kaylor2016-04-255-2/+15
| | | | | | Differential Revision: http://reviews.llvm.org/D19449 llvm-svn: 267480
* Add optimization bisect opt-in calls for AArch64 passesAndrew Kaylor2016-04-2512-0/+34
| | | | | | Differential Revision: http://reviews.llvm.org/D19394 llvm-svn: 267479
* Add accidentally deleted "break"Krzysztof Parzyszek2016-04-251-0/+1
| | | | llvm-svn: 267476
* [ORC] clang-format code that was touched in r267457. NFC.Lang Hames2016-04-255-213/+189
| | | | | Commit r267457 made a lot of type substitutions that threw off code formatting and alignment. This patch should tidy those changes up. llvm-svn: 267475
* ARM: put extern __thread stubs in a special section.Tim Northover2016-04-255-2/+33
| | | | | | | The linker needs to know that the symbols are thread-local to do its job properly. llvm-svn: 267473
* [ThinLTO] Introduce typedef for commonly-used map type (NFC)Teresa Johnson2016-04-253-22/+12
| | | | | | | | Add a typedef for the std::map<GlobalValue::GUID, GlobalValueSummary *> map that is passed around to identify summaries for values defined in a particular module. This shortens up declarations in a variety of places. llvm-svn: 267471
* [Hexagon] Few fixes for exception handlingKrzysztof Parzyszek2016-04-253-1/+13
| | | | llvm-svn: 267469
* Re-apply r267206 with a fix for the encoding problem: when the immediate ofQuentin Colombet2016-04-251-3/+14
| | | | | | | | | | | | | | | | | | log2(Mask) is smaller than 32, we must use the 32-bit variant because the 64-bit variant cannot encode it. Therefore, set the subreg part accordingly. [AArch64] Fix optimizeCondBranch logic. The opcode for the optimized branch does not depend on the size of the active bits in the AND masks, but the AND opcode itself. Indeed, we need to use an X or W variant based on the AND variant, not based on whether the mask fits into the related variant. Otherwise, we may end up using the W variant of the optimized branch for 64-bit register inputs! This fixes the last make check verifier issues for AArch64: PR27479. llvm-svn: 267465
* Cleanup redundant expression in InstCombineAndOrXor.Etienne Bergeron2016-04-251-2/+0
| | | | | | | | | | | | | Summary: The expression is redundant on both sides of operator |. Detected by: http://reviews.llvm.org/D19451 Reviewers: rnk, majnemer Subscribers: cfe-commits Differential Revision: http://reviews.llvm.org/D19459 llvm-svn: 267458
* [ORC] Thread Error/Expected through the RPC library.Lang Hames2016-04-254-26/+55
| | | | | | | | | | This replaces use of std::error_code and ErrorOr in the ORC RPC support library with Error and Expected. This required updating the OrcRemoteTarget API, Client, and server code, as well as updating the Orc C API. This patch also fixes several instances where Errors were dropped. llvm-svn: 267457
* AMDGPU/SI: Optimize adjacent s_nop instructionsMatt Arsenault2016-04-251-0/+27
| | | | | | | | | | Use the operand for how long to wait. This is somewhat distasteful, since it would be better to just emit s_nop with the right argument in the first place. This would require changing TII::insertNoop to emit N operands, which would be easy. Slightly more problematic is that the post-RA scheduler and hazard recognizer represent nops as a single null node, and would require inventing another way of representing N nops. llvm-svn: 267456