summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* Refactor debugging code, NFC.Than McIntosh2016-04-271-31/+30
| | | | | | | | | | | | | | | | | Summary: Refactor debugging routines to reduce code duplication. Remove a couple of #include's that were not needed. Don't require MachineDominator as a prereq for this pass (not needed). These changes split off from http://reviews.llvm.org/D18827. Reviewers: wmi, gbiv, qcolombet Subscribers: llvm-commits, davidxl, jevinskie Differential Revision: http://reviews.llvm.org/D18992 llvm-svn: 267766
* [NVPTX] Run NVVMReflect at the beginning of IR passes.Justin Lebar2016-04-272-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Currently the NVVMReflect pass is run at the beginning of our backend passes. But really, it should be run as early as possible, as it's simply resolving an "if" statement in code. So copy it into TargetMachine::addEarlyAsPossiblePasses. We still run it at the beginning of the backend passes, since it's needed for correctness when lowering to nvptx. (Specifically, NVVMReflect changes each call to the __nvvm_reflect function or llvm.nvvm.reflect intrinsic into an integer constant, based on the pass's configuration. Clearly we miss many optimization opportunities if we perform this transformation at the beginning of codegen.) Reviewers: rnk Subscribers: tra, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D18616 llvm-svn: 267765
* [LIR] Set attributes on memset_pattern16.Ahmed Bougacha2016-04-271-0/+2
| | | | | | | | | "inferattrs" will deduce the attribute, but it will be too late for many optimizations. Set it ourselves when creating the call. Differential Revision: http://reviews.llvm.org/D17598 llvm-svn: 267762
* [LIR] Reuse variable. NFCI.Ahmed Bougacha2016-04-271-1/+1
| | | | llvm-svn: 267761
* [InferAttrs] Mark memset_pattern16 params nocapture.Ahmed Bougacha2016-04-271-0/+2
| | | | | | Differential Revision: http://reviews.llvm.org/D19471 llvm-svn: 267760
* [TLI] Unify LibFunc attribute inference. NFCI.Ahmed Bougacha2016-04-272-795/+723
| | | | | | | | | | | | | Now the pass is just a tiny wrapper around the util. This lets us reuse the logic elsewhere (done here for BuildLibCalls) instead of duplicating it. The next step is to have something like getOrInsertLibFunc that also sets the attributes. Differential Revision: http://reviews.llvm.org/D19470 llvm-svn: 267759
* [TLI] Unify LibFunc signature checking. NFCI.Ahmed Bougacha2016-04-277-711/+583
| | | | | | | | | I tried to be as close as possible to the strongest check that existed before; cleaning these up properly is left for future work. Differential Revision: http://reviews.llvm.org/D19469 llvm-svn: 267758
* [TLI] Fix indentation. NFC.Ahmed Bougacha2016-04-271-1/+1
| | | | llvm-svn: 267757
* Clean up to avoid compiler warnings for casting away const qualifiers.Sjoerd Meijer2016-04-272-4/+4
| | | | | | Differential Revision: http://reviews.llvm.org/D19598 llvm-svn: 267753
* Revert "[AMDGPU][llvm-mc] Add support of TTMP quads. Rework M0 exclusion for ↵Chad Rosier2016-04-276-39/+13
| | | | | | | | SMRD." This reverts commit r267733 due to a -Werror,-Wunused-function error. llvm-svn: 267752
* [LV] Reallow positive-stride interleaved load groups with gapsMatthew Simpson2016-04-271-9/+47
| | | | | | | | | | | | | | We previously disallowed interleaved load groups that may cause us to speculatively access memory out-of-bounds (r261331). We did this by ensuring each load group had an access corresponding to the first and last member. Instead of bailing out for these interleaved groups, this patch enables us to peel off the last vector iteration, ensuring that we execute at least one iteration of the scalar remainder loop. This solution was proposed in the review of the previous patch. Differential Revision: http://reviews.llvm.org/D19487 llvm-svn: 267751
* [SLPVectorizer] Refactor where MinVecRegSize and MaxVecRegSize live.Arch D. Robison2016-04-271-20/+28
| | | | | | | | | This is the first of two commits for extending SLP Vectorizer to deal with aggregates. This commit merely refactors existing logic. http://reviews.llvm.org/D14185 llvm-svn: 267748
* [DAGCombiner] Follow coding convention for function name (NFC)Gerolf Hoflehner2016-04-273-4/+4
| | | | llvm-svn: 267745
* [Mips] Add support for llvm.thread.pointer intrinsic.Marcin Koscielnicki2016-04-271-0/+4
| | | | | | | | This will be used to implement __builtin_thread_pointer in clang. Differential Revision: http://reviews.llvm.org/D19569 llvm-svn: 267743
* Silence a -Wdangling-elseReid Kleckner2016-04-271-1/+2
| | | | llvm-svn: 267737
* Add parentheses to silence buildbot warningMatthew Simpson2016-04-271-2/+2
| | | | llvm-svn: 267734
* [AMDGPU][llvm-mc] Add support of TTMP quads. Rework M0 exclusion for SMRD.Artem Tamazov2016-04-276-13/+39
| | | | | | | | | | | Added support of TTMP quads. Reworked M0 exclusion machinery for SMRD and similar instructions to enable usage of TTMP registers in those instructions as destinations. Tests added. Differential Revision: http://reviews.llvm.org/D19342 llvm-svn: 267733
* [PDB] Fix function names for private symbols in PDBsReid Kleckner2016-04-271-14/+12
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: llvm-symbolizer wants to get linkage names of functions for historical reasons. Linkage names are only recorded in the PDB for public symbols, and the linkage name is apparently stored separately in some "public symbol" record. We had a workaround in PDBContext which would look for such symbols when the user requested linkage names. However, when given an address that was truly in a private function and public funciton, we would accidentally find nearby public symbols and return those function names. The fix is to look for both function symbols and public symbols and only prefer the public symbol name if the addresses of the symbols agree. Fixes PR27492 Reviewers: zturner Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19571 llvm-svn: 267732
* AMDGPU/SI: Add llvm.amdgcn.s.waitcnt.all intrinsicNicolai Haehnle2016-04-272-14/+78
| | | | | | | | | | | | | | | | | Summary: So it appears that to guarantee some of the ordering requirements of a GLSL memoryBarrier() executed in the shader, we need to emit an s_waitcnt. (We can't use an s_barrier, because memoryBarrier() may appear anywhere in the shader, in particular it may appear in non-uniform control flow.) Reviewers: arsenm, mareko, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19203 llvm-svn: 267729
* [TTI] Add hook for vector extract with extensionMatthew Simpson2016-04-274-3/+70
| | | | | | | | | | | | | | | This change adds a new hook for estimating the cost of vector extracts followed by zero- and sign-extensions. The motivating example for this change is the SMOV and UMOV instructions on AArch64. These instructions move data from vector to general purpose registers while performing the corresponding extension (sign-extend for SMOV and zero-extend for UMOV) at the same time. For these operations, TargetTransformInfo can assume the extensions are free and only report the cost of the vector extract. The SLP vectorizer has been updated to make use of the new hook. Differential Revision: http://reviews.llvm.org/D18523 llvm-svn: 267725
* [AMDGPU][llvm-mc] s_getreg/setreg* - Support symbolic names of hardware ↵Artem Tamazov2016-04-272-13/+42
| | | | | | | | | | | | registers. Possibility to specify code of hardware register kept. Disassemble to symbolic name, if name is known. Tests updated/added. Differential Revision: http://reviews.llvm.org/D19335 llvm-svn: 267724
* Revert r267649, it caused PR27539.Nico Weber2016-04-272-147/+7
| | | | llvm-svn: 267723
* [ThinLTO] Refine fix to avoid renaming of uses in inline assembly.Teresa Johnson2016-04-272-14/+30
| | | | | | | | | | | | | | | | | Summary: Refine the workaround from r266877 that attempts to prevent renaming of locals in inline assembly, so that in addition to looking for a llvm.used local value, that there is at least one inline assembly call in the module. Otherwise, debug functions added to the llvm.used can block importing/exporting unnecessarily. Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D19573 llvm-svn: 267717
* [ThinLTO] Use valueid instead of bitcode offsets in combined index fileTeresa Johnson2016-04-272-117/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: With the removal of support for lazy parsing of combined index summary records (e.g. r267344), we no longer need to include the summary record bitcode offset in the VST entries for definitions. Change the combined index format to be similar to the per-module index format in using value ids to cross-reference from the summary record to the VST entry (rather than the summary record bitcode offset to cross-reference in the other direction). The visible changes are: 1) Add the value id to the combined summary records 2) Remove the summary offset from the combined VST records, which has the following effects: - No longer need the VST_CODE_COMBINED_GVDEFENTRY record, as all combined index VST entries now only contain the value id and corresponding GUID. - No longer have duplicate VST entries in the case where there are multiple definitions of a symbol (e.g. weak/linkonce), as they all have the same value id and GUID. An implication of #2 above is that in order to hook up an alias to the correct aliasee based on the value id of the aliasee recorded in the combined index alias record, we need to scan the entries in the index for that GUID to find the one from the same module (i.e. the case where there are multiple entries for the aliasee). But the reader no longer has to maintain a special map to hook up the alias/aliasee. Reviewers: joker.eph Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D19481 llvm-svn: 267712
* NFC. Introduce Value::getPointerDerferecnceableBytesArtur Pilipenko2016-04-272-27/+36
| | | | | | | | | | Extract a part of isDereferenceableAndAlignedPointer functionality to Value::getPointerDerferecnceableBytes. Currently it's a NFC, but in future I'm going to accumulate all the logic about value dereferenceability in this function similarly to Value::getPointerAlignment function (D16144). Reviewed By: reames Differential Revision: http://reviews.llvm.org/D17572 llvm-svn: 267708
* [mips][microMIPS] Add CodeGen support for SUBU16, SUB, SUBU, DSUB and DSUBU ↵Zlatko Buljan2016-04-275-20/+70
| | | | | | | | instructions Differential Revision: http://reviews.llvm.org/D16676 llvm-svn: 267694
* [mips][microMIPS] Add CodeGen support for SLL16, SRL16, SLL, SLLV, SRA, ↵Zlatko Buljan2016-04-273-12/+41
| | | | | | | | SRAV, SRL and SRLV instructions Differential Revision: http://reviews.llvm.org/D17989 llvm-svn: 267693
* isSafeToLoadUnconditionally support queries without a contextArtur Pilipenko2016-04-275-12/+23
| | | | | | | | | | This is required to use this function from isSafeToSpeculativelyExecute Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D16231 llvm-svn: 267692
* Use DL preferred alignment for alloca in Value::getPointerAlignmentArtur Pilipenko2016-04-271-2/+7
| | | | | | | | | | Teach Value::getPointerAlignment that allocas with no explicit alignment are aligned to preferred alignment of the allocated type. Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D17569 llvm-svn: 267689
* [LoopDist] Add llvm.loop.distribute.enable loop metadataAdam Nemet2016-04-272-12/+73
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: D19403 adds a new pragma for loop distribution. This change adds support for the corresponding metadata that the pragma is translated to by the FE. As part of this I had to rethink the flag -enable-loop-distribute. My goal was to be backward compatible with the existing behavior: A1. pass is off by default from the optimization pipeline unless -enable-loop-distribute is specified A2. pass is on when invoked directly from opt (e.g. for unit-testing) The new pragma/metadata overrides these defaults so the new behavior is: B1. A1 + enable distribution for individual loop with the pragma/metadata B2. A2 + disable distribution for individual loop with the pragma/metadata The default value whether the pass is on or off comes from the initiator of the pass. From the PassManagerBuilder the default is off, from opt it's on. I moved -enable-loop-distribute under the pass. If the flag is specified it overrides the default from above. Then the pragma/metadata can further modifies this per loop. As a side-effect, we can now also use -enable-loop-distribute=0 from opt to emulate the default from the optimization pipeline. So to be precise this is the new behavior: C1. pass is off by default from the optimization pipeline unless -enable-loop-distribute or the pragma/metadata enables it C2. pass is on when invoked directly from opt unless -enable-loop-distribute=0 or the pragma/metadata disables it Reviewers: hfinkel Subscribers: joker.eph, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D19431 llvm-svn: 267672
* [Cloning] cloneLoopWithPreheader(): add assert to ensure no sub-loopsVaivaswatha Nagaraj2016-04-271-0/+2
| | | | | | | | | | | | | | Summary: cloneLoopWithPreheader() does not update LoopInfo for sub-loop of the original loop being cloned. Add assert to ensure no sub-loops for loop being cloned. Reviewers: anemet, ashutosh.nema, hfinkel Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D15922 llvm-svn: 267671
* [Support][X86] Add a few more Intel model numbers to getHostCPUName for ↵Craig Topper2016-04-271-0/+4
| | | | | | airmont and knl. llvm-svn: 267670
* [Support][X86] Change the case values in the Intel family 6 code to hex so ↵Craig Topper2016-04-271-68/+66
| | | | | | its easier to compare with Intel's docs. NFC llvm-svn: 267669
* Revert "Support "preserving" the summary information when using setModule() ↵Mehdi Amini2016-04-271-8/+1
| | | | | | | | | | API in LTOCodeGenerator" This reverts commit r267665. ASAN shows that there is a use of undefined value. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267668
* [Support][X86] Add a couple more Broadwell CPU models numbers to getHostCPUName.Craig Topper2016-04-271-0/+2
| | | | llvm-svn: 267666
* Support "preserving" the summary information when using setModule() API in ↵Mehdi Amini2016-04-271-1/+8
| | | | | | | | | LTOCodeGenerator Another attempt at r267655... From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267665
* Revert "Support "preserving" the summary information when using setModule() ↵Mehdi Amini2016-04-271-8/+1
| | | | | | | | | | API in LTOCodeGenerator" This reverts commit r267657, r267656, and r267655. The test does not pass on multiple bots, I'm unsure why yet but let's unbreak them. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267664
* The patch fixes PR27392.Evgeny Stupachenko2016-04-271-9/+10
| | | | | | | | | | | | | | | | | Summary: It is incorrect to compare TripCount (which is BECount + 1) with extraiters (or Count) to check if we should enter unrolled loop or not, because TripCount can potentially overflow (when BECount is max unsigned integer). While comparing BECount with (Count - 1) is overflow safe and therefore correct. Reviewer: hfinkel Differential Revision: http://reviews.llvm.org/D19256 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 267662
* [LVI] Delete stale and misleading comment.Philip Reames2016-04-271-5/+0
| | | | llvm-svn: 267661
* [ppc64] fix bug in prologue that mfocrf's cr operand should be explict state ↵Chuang-Yu Cheng2016-04-271-4/+10
| | | | | | | | | | | | instead of implicit This fixes PR27414 Reviewers: kbarton mgrang tjablin http://reviews.llvm.org/D19255 llvm-svn: 267660
* [X86] Set AddPristinesAndCSRs to FixupBW LivePhysRegs. NFC.Ahmed Bougacha2016-04-271-1/+2
| | | | | | | | | | | | | We run after PEI, so we need to AddPristinesAndCSRs. In practice, that makes no difference here, because we only ask about liveness of super-registers of defined GR8/GR16 registers, so they can't be pristine. Still, it's the correct thing to do. Thanks to Quentin for noticing! Follow-up to r267495. llvm-svn: 267658
* Support "preserving" the summary information when using setModule() API in ↵Mehdi Amini2016-04-271-1/+8
| | | | | | | LTOCodeGenerator From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267655
* Fix typo in comment; NFCSanjoy Das2016-04-271-1/+1
| | | | llvm-svn: 267653
* [X86] Don't assume that MMX extractelts are from index 0.Ahmed Bougacha2016-04-271-1/+3
| | | | | | | It's probably the case for all 3 MMX users out there, but with hand-crafted IR, you can trigger selection failures. Fix that. llvm-svn: 267652
* [X86] Re-enable MMX i32 extractelt combine.Ahmed Bougacha2016-04-271-10/+3
| | | | | | | | | This effectively adds back the extractelt combine removed by r262358: the direct case can still occur (because x86_mmx is special, see r262446), but it's the indirect case that's now superseded by the generic combine. llvm-svn: 267651
* Detects the SAD pattern on X86 so that much better code will be emitted once ↵Cong Hou2016-04-272-7/+147
| | | | | | | | the pattern is matched. Differential revision: http://reviews.llvm.org/D14840 llvm-svn: 267649
* [LVI] Add a comment explaining a subtle piece of codePhilip Reames2016-04-271-15/+23
| | | | | | Or at least, I didn't understand the implications the first several times I read it it. llvm-svn: 267648
* ThinLTO: do not promote GlobalVariable that have a specific section.Mehdi Amini2016-04-272-4/+75
| | | | | | | Differential Revision: http://reviews.llvm.org/D18298 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267646
* SLSR: Use UnknownAddressSpace instead of 0 for pure arithmetic.Matt Arsenault2016-04-271-1/+3
| | | | | | | | | In the case where isLegalAddressingMode is used for cases not related to addressing modes, such as pure adds and muls, it should not be using address space 0. LSR already passes -1 as the address space in these cases. llvm-svn: 267645
* LTOCodeGenerator: turns linkonce(_odr) into weak_(odr) when present ↵Mehdi Amini2016-04-271-19/+48
| | | | | | | | | | | | | | | | | "MustPreserve" set Summary: If the linker requested to preserve a linkonce function, we should honor this even if we drop all uses. Reviewers: dexonsmith Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D19527 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267644
OpenPOWER on IntegriCloud