path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* fix typos; NFCSanjay Patel2016-01-061-7/+6
| | | | llvm-svn: 256881
* [libFuzzer] make trace-based fuzzing not crash in presence of threadsKostya Serebryany2016-01-064-6/+46
| | | | llvm-svn: 256876
* [Statepoints] Check for the "gc-leaf-function" attribute on call sites as well.Manuel Jacob2016-01-051-2/+2
| | | | | | | | | | Reviewers: sanjoy, reames Subscribers: sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D15900 llvm-svn: 256875
* [LibCallSimplifier] use instruction-level fast-math-flags for fmin/fmax ↵Sanjay Patel2016-01-051-4/+2
| | | | | | transforms llvm-svn: 256871
* AMDGPU/SI: Do not move scratch resource register on Tonga & IcelandNicolai Haehnle2016-01-051-42/+44
| | | | | | | | | | | | | Due to the SGPR init bug, every program claims to use the same number of SGPRs anyway, so there's no point in trying to shift those registers down from their initial spot of reservation. Add a test that uses VGPR spilling and blocks most SGPRs from being used for the scratch resource register. Previously, this would run into an assertion. Differential Revision: http://reviews.llvm.org/D15724 llvm-svn: 256870
* Implement load to store => memcpy in MemCpyOpt for aggregatesAmaury Sechet2016-01-051-11/+73
| | | | | | | | | | | | | | | Summary: Most of the tool chain is able to optimize scalar and memcpy-like operations efficiently, while it isn't that good with aggregates. In order to improve the support of aggregates, we try to change aggregate manipulation into either scalar or memcpy-like ones whenever possible without losing information. This is one such opportunity. Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15894 llvm-svn: 256868
* [Clang/Support/Windows/Unix] Command lines created by clang may exceed the ↵Oleg Ranevskyy2016-01-052-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | command length limit set by the OS Summary: Hi Rafael, Would you be able to review this patch, please? (Clang part of the patch is D15832). When clang runs an external tool, e.g. a linker, it may create a command line that exceeds the length limit. Clang uses the llvm::sys::argumentsFitWithinSystemLimits function to check if command line length fits the OS limitation. There are two problems in this function that may cause exceeding of the limit: 1. It ignores the length of the program path in its calculations. On the other hand, clang adds the program path to the command line when it runs the program. 2. It assumes no space character is inserted after the last argument, which is not true for Windows. The flattenArgs function adds the trailing space for *each* argument. The result of this is that the terminating NULL character is not counted and may be placed beyond the length limit if the command line is exactly 32768 characters long. The WinAPI's CreateProcess does not find the NULL character and fails. Reviewers: rafael, ygao, probinson Subscribers: asl, llvm-commits Differential Revision: http://reviews.llvm.org/D15831 llvm-svn: 256866
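The two accounting problems described in this message can be illustrated with a minimal sketch. This is not the actual llvm::sys::argumentsFitWithinSystemLimits implementation: the function name `fitsWithinLimit`, the constant `kMaxCommandLine`, and the choice to treat 32768 as a buffer size that includes the terminating NUL are all assumptions made for illustration, and quoting or escaping of arguments is ignored.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical limit, based on the 32768-character figure in the message.
// Assumption: the limit is a buffer size that must also hold the final NUL.
constexpr std::size_t kMaxCommandLine = 32768;

// Returns true if the flattened command line fits within kMaxCommandLine.
// Counts the program path (problem 1), one trailing space after every
// element including the last (problem 2), and the terminating NUL.
bool fitsWithinLimit(const std::string &Program,
                     const std::vector<std::string> &Args) {
  std::size_t Length = Program.size() + 1; // program path + trailing space
  for (const std::string &Arg : Args)
    Length += Arg.size() + 1;              // argument + trailing space
  Length += 1;                             // terminating NUL
  return Length <= kMaxCommandLine;
}
```

With this accounting, a command line that lands exactly on the limit without the NUL is rejected rather than silently overflowing by one character.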
* [InstCombine] insert a new shuffle before its uses (PR26015)Sanjay Patel2016-01-051-8/+21
| | | | | | | | | | | | | | | | Although this solves the test case in PR26015: https://llvm.org/bugs/show_bug.cgi?id=26015 And may solve PR25999: https://llvm.org/bugs/show_bug.cgi?id=25999 ...I suspect this is not the best solution. I think we want to insert the new shuffle just ahead of the earliest ExtractElementInst that we're replacing, but I don't know how that should be implemented. Differential Revision: http://reviews.llvm.org/D15878 llvm-svn: 256857
* Add function for testing string attributes to InvokeInst and CallSite. NFC.Manuel Jacob2016-01-051-14/+0
| | | | llvm-svn: 256856
* [X86] Determine if we have an OpaqueSPAdjustment earlierDavid Majnemer2016-01-051-4/+12
| | | | | | | | | | | | We queried hasFP before we hit ExpandISelPseudos. ExpandISelPseudos manipulated state that hasFP relied on, potentially changing the result after it has been queried elsewhere. While I am not aware of any particular bug due to this state of affairs, it seems best to avoid it entirely by changing the state during DAG construction. llvm-svn: 256849
* [AVX512] add PSLLD and PSLLQ IntrinsicMichael Zuckerman2016-01-051-0/+10
| | | | | | Differential Revision: http://reviews.llvm.org/D15885 llvm-svn: 256840
* [MISched] Explanatory error message when machine model is not complete. NFCMinSeong Kim2016-01-051-1/+1
| | | | | | | | | When not all instructions have a scheduling class, the error message now provides a possible solution. Differential Revision: http://reviews.llvm.org/D15854 llvm-svn: 256839
* [AArch64] Add support for Samsung Exynos-M1MinSeong Kim2016-01-054-2/+28
| | | | | | | | Adds core tuning support for new Samsung Exynos-M1 core (ARMv8-A). Differential Revision: http://reviews.llvm.org/D15663 llvm-svn: 256828
* (NFC) Change SubtargetFeatures::ToggleFeature andArtyom Skrobov2016-01-052-16/+8
| | | | | | | | | | | SubtargetFeatures::ApplyFeatureFlag to be static, so that MCSubtargetInfo doesn't need to instantiate SubtargetFeatures for nothing. Also change the return type to void, as it wasn't ever used. This is a partial commit of http://reviews.llvm.org/D15746 llvm-svn: 256823
* Remove extra whitespace. NFC.Junmo Park2016-01-051-2/+2
| | | | llvm-svn: 256820
* [X86][SSE] Merge PerformBLENDICombine into PerformShuffleCombineSimon Pilgrim2016-01-051-29/+26
| | | | | | PBLEND/BLENDPD/BLENDPS are no different to the other target shuffles and this will make future improvements to the target shuffle combines more straightforward. llvm-svn: 256819
* [X86] Make MOV32ri64 a post-RA pseudo instead of a CodeGenOnly instruction. ↵Craig Topper2016-01-053-9/+7
| | | | | | It was only needed for rematerialization. llvm-svn: 256818
* [X86] Add OpSize32 to OR32mrLocked instruction to match the normal OR32mr ↵Craig Topper2016-01-051-2/+2
| | | | | | instruction. llvm-svn: 256817
* [AVX512] Add hasSideEffects=0 to kunpck instructions since they lack a ↵Craig Topper2016-01-051-0/+1
| | | | | | pattern in their instructions. llvm-svn: 256816
* [SimplifyCFG] Further improve our ability to remove redundant catchpadsDavid Majnemer2016-01-051-2/+26
| | | | | | | In r256814, we managed to remove catchpads which were trivially redundant because they were the same SSA value. We can do better using the same algorithm but with a smarter data structure by hashing the SSA values within the catchpad and comparing them structurally. llvm-svn: 256815
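The idea of deduplicating handlers by structural hashing can be sketched as follows. This is a toy model, not the SimplifyCFG code: `Handler`, `HandlerHash`, and `removeDuplicateHandlers` are hypothetical names, a handler is reduced to a plain list of integer argument IDs, and any additional conditions the real pass checks are not modeled.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a catchpad: only its argument values matter here.
struct Handler {
  std::vector<int> Args;
  bool operator==(const Handler &Other) const { return Args == Other.Args; }
};

// Hashes a handler by combining the hashes of its arguments in order.
struct HandlerHash {
  std::size_t operator()(const Handler &H) const {
    std::size_t Seed = H.Args.size();
    for (int A : H.Args)
      Seed ^= std::hash<int>{}(A) + 0x9e3779b9 + (Seed << 6) + (Seed >> 2);
    return Seed;
  }
};

// Keeps the first occurrence of each structurally distinct handler,
// preserving the original order.
std::vector<Handler> removeDuplicateHandlers(const std::vector<Handler> &In) {
  std::unordered_set<Handler, HandlerHash> Seen;
  std::vector<Handler> Out;
  for (const Handler &H : In)
    if (Seen.insert(H).second) // true only when H was not seen before
      Out.push_back(H);
  return Out;
}
```

Hashing first and comparing structurally only on collision keeps the pass close to linear in the number of handlers, where a pairwise comparison would be quadratic.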
* [SimplifyCFG] Remove redundant catchpadsDavid Majnemer2016-01-051-2/+17
| | | | | | Remove duplicate catchpad handlers from a catchswitch. llvm-svn: 256814
* AMDGPU: Remove redundant let mayLoad = 1Matt Arsenault2016-01-051-4/+0
| | | | | | This is already set on the SMRD format class. llvm-svn: 256813
* [RS4GC] Simplify handling of Constants in findBaseDefiningValue(). NFC.Manuel Jacob2016-01-051-22/+7
| | | | | | | | | | | | | | | Summary: Previously there were three conditionals, checking for global variables, undef values and everything constant except these two, all three returning the same value. This commit replaces them by one conditional. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15818 llvm-svn: 256812
* [Statepoints] Refactor GCRelocateOperands into an intrinsic wrapper. NFC.Manuel Jacob2016-01-0510-104/+65
| | | | | | | | | | | | | | | Summary: This commit renames GCRelocateOperands to GCRelocateInst and makes it an intrinsic wrapper, similar to e.g. MemCpyInst. Also, all users of GCRelocateOperands were changed to use the new intrinsic wrapper instead. Reviewers: sanjoy, reames Subscribers: reames, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D15762 llvm-svn: 256811
* AMDGPU/SI: Select non-uniform constant addrspace loads to flat instructions ↵Tom Stellard2016-01-051-1/+2
| | | | | | | | | | | | | | for HSA Summary: This fixes a regression caused by r256282. Reviewers: arsenm, cfang Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15736 llvm-svn: 256810
* [WinEH] Simplify unreachable catchpadsJoseph Tremoulet2016-01-052-13/+64
| | | | | | | | | | | | | | | | Summary: At least for CoreCLR, a catchpad which immediately executes an `unreachable` instruction indicates that the exception can never have a matching type, and so such catchpads can be removed, and so can their catchswitches if the catchswitch becomes empty. Reviewers: rnk, andrew.w.kaylor, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15846 llvm-svn: 256809
* Revert "[X86] Use push-pop for materializing small constants under 'minsize'"David Majnemer2016-01-055-76/+4
| | | | | | | | | | | | The red zone consists of 128 bytes beyond the stack pointer so that the allocation of objects in leaf functions doesn't require decrementing rsp. In r255656, we introduced an optimization that would cheaply materialize certain constants via push/pop. Push decrements the stack pointer and stores its result at what is now the top of the stack. However, this means that using push/pop would encroach on the red zone. PR26023 gives an example where this corrupts an object in the red zone. llvm-svn: 256808
* AMDGPU/SI: Consolidate FLAT patternsTom Stellard2016-01-053-90/+42
| | | | | | | | | | | | | | | | | | | | | | | Summary: We had two sets of identical FLAT patterns, one inside the HasFlatAddressSpace predicate and one inside the useFlatForGlobal predicate. This patch merges these sets into a single pattern under the isCIVI predicate. The reason we can remove the predicates is that when MUBUF instructions are legal, the instruction selector will prefer selecting those over FLAT instructions because MUBUF patterns have a higher complexity score. So, in this case having patterns for FLAT instructions will have no effect. This change also simplifies the process for forcing global address space loads to use FLAT instructions, since we now only have to disable the MUBUF patterns instead of having to disable the MUBUF patterns and enable the FLAT patterns. Reviewers: arsenm, cfang Subscribers: llvm-commits llvm-svn: 256807
* [MDA] Don't be quite as conservative for noalias functionsPhilip Reames2016-01-051-7/+7
| | | | | | | | | | If we encounter a noalias call that alias analysis can't analyse, we can fall down into the generic call handling rather than giving up entirely. I noticed this while reading through the code for another purpose. I can't seem to write a test case which changes; that sorta makes sense given any test case would have to be an inconsistency in AA. Suggestions welcome. Differential Revision: http://reviews.llvm.org/D15825 llvm-svn: 256802
* MachineInstrBundle: Fix reversed isSuperRegisterEq() callMatthias Braun2016-01-052-1/+5
| | | | | | | | | | | | | | | Unfortunately this fix had the effect of exposing the -verify-machineinstrs FIXME of X86InstrInfo.cpp in two testcases for which I disabled it for now. Two testcases also have additional pushq/popq where the corrected code cannot prove that %rax is dead any longer. Looking at the examples, this could potentially be fixed by improving computeRegisterLiveness() to check the live-in lists of the successor blocks when reaching the end of a block. This fixes http://llvm.org/PR25951. llvm-svn: 256799
* AMDGPU: add +xnack featureNicolai Haehnle2016-01-045-11/+46
| | | | | | | | | | | | | | | | | | | Summary: Enabling this feature will account for the two SGPRs used by the hardware to store the XNACK_MASK physically. The hardware only requires this reservation when the XNACK feature is explicitly enabled. At some point, HSA will probably want to do that, but it does increase SGPR register pressure, so leave it disabled by default for now (but do add a small test). Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15869 llvm-svn: 256794
* [InstructionCombining] prepareICWorklistFromFunction halts in infinite loop ↵Chen Li2016-01-041-3/+2
| | | | | | | | | | | | | | with instructions of token type Summary: This patch fixes a bug in prepareICWorklistFromFunction, where the loop becomes infinite with instructions of token type. The patch checks if the instruction is token type, and if so it updates EndInst with the current instruction. Reviewers: reames, majnemer Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D15859 llvm-svn: 256792
* Clarify that the bypassSlowDivision optimization operates on a single BB [v2]Eric Christopher2016-01-042-58/+52
| | | | | | | | | | | | | | | | Update some comments to be more explicit. Change bypassSlowDivision and the functions it calls so that they take BasicBlock*s and Instruction*s, rather than Function::iterator&s and BasicBlock::iterator&s. Change the APIs so that the caller is responsible for updating the iterator, rather than the callee. This makes control flow much easier to follow. Patch by Justin Lebar! llvm-svn: 256789
* [LICM] Fix a small oversight introduced in r256763David Majnemer2016-01-041-6/+6
| | | | | | | | | | | r256763 had promoteLoopAccessesToScalars check for the existence of a catchswitch when the exit blocks were populated but promoteLoopAccessesToScalars may be called with a prepopulated set of exit blocks which would also need to be checked. This fixes PR26019. llvm-svn: 256788
* [MemoryBuiltins] Remove isOperatorNewLike by consolidating non-null ↵Philip Reames2016-01-043-12/+28
| | | | | | | | | | | | inference handling This patch removes the isOperatorNewLike predicate since it was only being used to establish a non-null return value and we have attributes specifically for that purpose with generic handling. To keep approximately the same behaviour for existing frontends, I added the various operator-new-like functions (i.e. instances of operator new) to InferFunctionAttrs. It's not really clear to me why this isn't handled in Clang, but I didn't want to break existing code and any subtle assumptions it might have. Once this patch is in, I'm going to start separating the isAllocLike family of predicates. These appear to be used for a mixture of things which should be more clearly separated and documented. Today, they're being used to indicate (at least) aliasing facts, CSE-ability, and default values from an allocation site. Differential Revision: http://reviews.llvm.org/D15820 llvm-svn: 256787
* [PGO] Simplify string parsingXinliang David Li2016-01-041-13/+3
| | | | | | Patch Suggested by Vedant. llvm-svn: 256785
* [PGO] Refactor string writer codeXinliang David Li2016-01-041-12/+18
| | | | | | | For readability and code sharing. (Adapted from Suggestions by Vedant). llvm-svn: 256784
* [LIR] General refactoring to simplify code and the ease future code reviewHaicheng Wu2016-01-041-62/+120
| | | | | | | | | | This is a resubmission of r256336 which was reverted in r256361. The issue was the lack of the invariant check of the memset value in processLoopMemSet(). The original message: Move several checks into isLegalStores. Also, delineate between those stores that are memset-able and those that are memcpy-able. llvm-svn: 256783
* [X86][SSE] Ensure BLENDPD/BLENDPS/PBLEND inputs are both of the correct ↵Simon Pilgrim2016-01-041-0/+3
| | | | | | input type llvm-svn: 256782
* [PGO]: Use efficient 'join' API for uncompressed stringXinliang David Li2016-01-041-13/+5
| | | | llvm-svn: 256781
* [PGO]: reserve space for string to avoid excessive memory realloc/copy (non ↵Xinliang David Li2016-01-041-0/+6
| | | | | | linear) llvm-svn: 256776
* AMDGPU/SI: Move VI SMEM pattern back into VIInstructions.tdTom Stellard2016-01-042-6/+9
| | | | | | | | | | | | Summary: This was accidentally moved to CIInstructions.td in r256282. Reviewers: cfang, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15763 llvm-svn: 256775
* Remove dead instructions before RedoingAditya Nandakumar2016-01-041-1/+33
| | | | | | | | | Before reevaluating instructions, iterate over all instructions to be reevaluated and remove the trivially dead ones. If any operand of a removed instruction becomes trivially dead, mark it for deletion as well, repeating until all trivially dead instructions have been removed. llvm-svn: 256773
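The iterative cleanup described in this message follows a common worklist pattern, sketched here on a toy IR. Everything in the block is hypothetical (the `Inst` struct, `isTriviallyDead`, `removeDeadInstructions`) and only illustrates the general technique, not the code touched by this commit.

```cpp
#include <cstddef>
#include <vector>

// Toy instruction: operands are indices of other instructions in the vector.
struct Inst {
  std::vector<std::size_t> Operands;
  unsigned NumUses = 0;        // how many operand slots refer to this one
  bool HasSideEffects = false; // e.g. a store or a call
  bool Deleted = false;
};

// An instruction is trivially dead when nothing uses it and it has no
// side effects.
bool isTriviallyDead(const Inst &I) {
  return !I.Deleted && I.NumUses == 0 && !I.HasSideEffects;
}

// Removes every trivially dead instruction reachable from the worklist,
// including operands that become dead as their users are deleted.
// Returns the number of instructions removed.
unsigned removeDeadInstructions(std::vector<Inst> &Insts,
                                std::vector<std::size_t> Worklist) {
  unsigned Removed = 0;
  while (!Worklist.empty()) {
    std::size_t Idx = Worklist.back();
    Worklist.pop_back();
    Inst &I = Insts[Idx];
    if (!isTriviallyDead(I))
      continue;
    I.Deleted = true;
    ++Removed;
    for (std::size_t Op : I.Operands)
      if (--Insts[Op].NumUses == 0) // operand lost its last user
        Worklist.push_back(Op);
  }
  return Removed;
}
```

An operand is pushed onto the worklist only when its use count drops to zero, so each instruction is examined a bounded number of times.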
* [AArch64] Optimize some simple TBZ/TBNZ cases.Geoff Berry2016-01-041-0/+100
| | | | | | | | | | | | | | | | | | Summary: Add some AArch64 dag combines to optimize some simple TBZ/TBNZ cases: (tbz (and x, m), b) -> (tbz x, b) (tbz (shl x, c), b) -> (tbz x, b-c) (tbz (shr x, c), b) -> (tbz x, b+c) (tbz (xor x, -1), b) -> (tbnz x, b) Reviewers: jmolloy, mcrosier, t.p.northover Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D15702 llvm-svn: 256765
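The four rewrites listed above can be sanity-checked with a small bit-level model. This is a sketch, not the AArch64 DAG combine: `tbz`, `tbnz`, and `rewritesHold` are hypothetical helpers, values are modeled as 64-bit unsigned integers, the shift right is assumed to be logical, and the side conditions (the mask keeps bit b, b >= c for shl, b + c < 64 for shr) are assumptions about when each rewrite is valid that the message itself does not spell out.

```cpp
#include <cassert>
#include <cstdint>

// Models the TBZ branch condition: true when bit B of X is zero.
constexpr bool tbz(std::uint64_t X, unsigned B) { return ((X >> B) & 1) == 0; }

// Models the TBNZ branch condition: true when bit B of X is one.
constexpr bool tbnz(std::uint64_t X, unsigned B) { return !tbz(X, B); }

// Checks each rewrite whose (assumed) side condition is met for the given
// inputs and returns true when all of the checked rewrites agree.
bool rewritesHold(std::uint64_t X, std::uint64_t M, unsigned C, unsigned B) {
  assert(B < 64 && C < 64);
  bool Ok = true;
  if ((M >> B) & 1) // (tbz (and x, m), b) -> (tbz x, b)
    Ok = Ok && (tbz(X & M, B) == tbz(X, B));
  if (B >= C)       // (tbz (shl x, c), b) -> (tbz x, b-c)
    Ok = Ok && (tbz(X << C, B) == tbz(X, B - C));
  if (B + C < 64)   // (tbz (shr x, c), b) -> (tbz x, b+c)
    Ok = Ok && (tbz(X >> C, B) == tbz(X, B + C));
  // (tbz (xor x, -1), b) -> (tbnz x, b)
  Ok = Ok && (tbz(~X, B) == tbnz(X, B));
  return Ok;
}
```

Each rewrite removes one instruction ahead of the branch by folding its effect into the tested bit index or into the branch polarity.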
* Clang-format my previous change (r256313)Paul Robinson2016-01-041-5/+5
| | | | llvm-svn: 256764
* [LICM] Don't insert instructions after a catchswitch when performing loop ↵David Majnemer2016-01-041-9/+15
| | | | | | | | | promotion Inserting after a catchswitch results in verifier errors, bail out on promotion if a catchswitch is a loop exit. llvm-svn: 256763
* Fix comment in typo. NFCNick Lewycky2016-01-041-1/+1
| | | | llvm-svn: 256761
* [WinEH] Update CoreCLR EH state numberingJoseph Tremoulet2016-01-042-82/+203
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Fix the CLR state numbering to generate correct tables, and update the lit test to verify them. The CLR numbering assigns one state number to each catchpad and cleanuppad. It also computes two tree-like relations over states: 1) Each state has a "HandlerParentState", which is the state of the next outer handler enclosing this state's handler (same as nearest ancestor per the ParentPad linkage on EH pads, but skipping over catchswitches). 2) Each state has a "TryParentState", which: a) for a catchpad that's not the last handler on its catchswitch, is the state of the next catchpad on that catchswitch. b) for all other pads, is the state of the pad whose try region is the next outer try region enclosing this state's try region. The "try regions" are not present as such in the IR, but will be inferred based on the placement of invokes and pads which reach each other by exceptional exits. Catchswitches do not get their own states, but each gets mapped to the state of its first catchpad. Table generation requires each state's "unwind dest" state to have a lower state number than the given state. Since HandlerParentState can be computed as a function of a pad's ParentPad, and TryParentState can be computed as a function of its unwind dest and the TryParentStates of its children, the CLR state numbering algorithm first computes HandlerParentState in a top-down pass, then computes TryParentState in a bottom-up pass. Also reword some comments/names in the CLR EH table generation to make the distinction between the different kinds of "parent" clear. Reviewers: rnk, andrew.w.kaylor, majnemer Subscribers: AndyAyers, llvm-commits Differential Revision: http://reviews.llvm.org/D15325 llvm-svn: 256760
* AMDGPU: Avoid assertions after SGPR spilling failedNicolai Haehnle2016-01-042-10/+11
| | | | | | | | | | | | | | | | | | | | Summary: The comment explains it: emitError does not necessarily exit the compilation process, and then using NoRegister leads to assertions later on. This generates incorrect code, of course, but the user should know to not use the result when an error has been emitted. It would be nice to have a test-case for this inside the LLVM repository, but llc exits on error. shader-db tests trigger the underlying issue at least on Tonga. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15826 llvm-svn: 256757
* [AVX512] add PSRAD and PSRAQ IntrinsicMichael Zuckerman2016-01-041-0/+10
| | | | | | Differential Revision: http://reviews.llvm.org/D15851 llvm-svn: 256754