summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [InstCombine] Reorder recently added/improved pow transformationsDavid Bolvansky2019-07-111-3/+3
| | | | | | Changed cases are now faster with exp2. llvm-svn: 365758
* Revert [BitcodeReader] Validate OpNum, before accessing Record array.Florian Hahn2019-07-111-4/+0
| | | | | | | | | | | | This reverts r365750 (git commit 8b222ecf2769ee133691f208f6166ce118c4a164) llvm-dis runs out of memory while opening invalid-fcmp-opnum.bc on llvm-hexagon-elf, probably because the bitcode file contains other suspicious values. http://lab.llvm.org:8011/builders/llvm-hexagon-elf/builds/21949 llvm-svn: 365757
* [llvm-objcopy] Don't change permissions of non-regular output filesFangrui Song2019-07-112-7/+13
| | | | | | | | | | | | | | | | | | | | | | There is currently an EPERM error when a regular user executes `llvm-objcopy a.o /dev/null`. Worse, root can even change the mode bits of /dev/null. Fix it by checking if the output file is special. A new overload of llvm::sys::fs::setPermissions with FD as the parameter is added. Users should provide `perm & ~umask` as the parameter if they intend to respect umask. The existing overload of llvm::sys::fs::setPermissions may be deleted if we can find an implementation of fchmod() on Windows. fchmod() is usually better than chmod() because it saves syscalls and can avoid race condition. Reviewed By: jakehehrlich, jhenderson Differential Revision: https://reviews.llvm.org/D64236 llvm-svn: 365753
* [X86] -fno-plt: use GOT __tls_get_addr only if GOTPCRELX is enabledFangrui Song2019-07-111-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: As of binutils 2.32, ld has a bogus TLS relaxation error when the GD/LD code sequence using R_X86_64_GOTPCREL (instead of R_X86_64_GOTPCRELX) is attempted to be relaxed to IE/LE (binutils PR24784). gold and lld are good. In gcc/config/i386/i386.md, there is a configure-time check of as/ld support and the GOT relaxation will not be used if as/ld doesn't support it: if (flag_plt || !HAVE_AS_IX86_TLS_GET_ADDR_GOT) return "call\t%P2"; return "call\t{*%p2@GOT(%1)|[DWORD PTR %p2@GOT[%1]]}"; In clang, -DENABLE_X86_RELAX_RELOCATIONS=OFF is the default. The ld.bfd bogus error can be reproduced with: thread_local int a; int main() { return a; } clang -fno-plt -fpic a.cc -fuse-ld=bfd GOTPCRELX gained relative good support in 2016, which is considered relatively new. It is even difficult to conditionally default to -DENABLE_X86_RELAX_RELOCATIONS=ON due to cross compilation reasons. So work around the ld.bfd bug by only using GOT when GOTPCRELX is enabled. Reviewers: dalias, hjl.tools, nikic, rnk Reviewed By: nikic Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64304 llvm-svn: 365752
* [BitcodeReader] Validate OpNum, before accessing Record array.Florian Hahn2019-07-111-0/+4
| | | | | | | | | | | | | | | Currently invalid bitcode files can cause a crash, when OpNum exceeds the number of elements in Record, like in the attached bitcode file. The test case was generated by clusterfuzz: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=15698 Reviewers: t.p.northover, thegameg, jfb Reviewed By: jfb Differential Revision: https://reviews.llvm.org/D64507 llvm-svn: 365750
* [ARM][LowOverheadLoops] Correct offset checkingSam Parker2019-07-113-11/+29
| | | | | | | | | | | | | | | | This patch addresses a couple of problems: 1) The maximum supported offset of LE is -4094. 2) The offset of WLS also needs to be checked, this uses a maximum positive offset of 4094. The use of BasicBlockUtils has been changed because the block offsets weren't being initialised, but the isBBInRange checks both positive and negative offsets. ARMISelLowering has been tweaked because the test case presented another pattern that we weren't supporting. llvm-svn: 365749
* [ARM] Remove nonexistent unsigned forms of MVE VQDMLAH.Simon Tatham2019-07-111-3/+0
| | | | | | | | | | | | | | | | | | | | The VQDMLAH.U8, VQDMLAH.U16 and VQDMLAH.U32 instructions don't actually exist: the Armv8.1-M architecture spec only lists signed forms of that instruction. The unsigned ones were added in error: they existed in an early draft of the spec, but they were removed before the public version, and we missed that particular spec change. Also affects the variant forms VQDMLASH, VQRDMLAH and VQRDMLASH. Reviewers: miyuki Subscribers: javed.absar, kristof.beyls, hiraditya, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64502 llvm-svn: 365747
* [MIPS GlobalISel] Skip copies in addUseDef and addDefUsesPetar Avramovic2019-07-112-11/+48
| | | | | | | | | | | | Skip copies between virtual registers during search for UseDefs and DefUses. Since each operand has one def search for UseDefs is straightforward. But since operand can have many uses, we have to check all uses of each copy we traverse during search for DefUses. Differential Revision: https://reviews.llvm.org/D64486 llvm-svn: 365744
* [MIPS GlobalISel] RegBankSelect for chains of ambiguous instructionsPetar Avramovic2019-07-112-14/+77
| | | | | | | | | | | | | | | | | | | | When one of the uses/defs of ambiguous instruction is also ambiguous visit it recursively and search its uses/defs for instruction with only one mapping available. When all instruction in a chain are ambiguous arbitrary mapping can be selected. For s64 operands in ambiguous chain fprb is selected since it results in less instructions then having to narrow scalar s64 to s32. For s32 both gprb and fprb result in same number of instructions and gprb is selected like a general purpose option. At the moment we always avoid cross register bank copies. TODO: Implement a model for costs calculations of different mappings on same instruction and cross bank copies. Allow cross bank copies when appropriate according to cost model. Differential Revision: https://reviews.llvm.org/D64485 llvm-svn: 365743
* Revert Recommit "[CommandLine] Remove OptionCategory and SubCommand caches ↵Haojian Wu2019-07-111-53/+59
| | | | | | | | | | from the Option class." This reverts r365675 (git commit 43d75f977853c3ec891a440c362b2df183a211b5) The patch causes a crash in SupportTests (CommandLineTest.AliasesWithArguments). llvm-svn: 365742
* Remove some redundant code from r290372 and improve a comment.Jay Foad2019-07-111-5/+3
| | | | llvm-svn: 365741
* [ARM][ParallelDSP] Change the search for smladsSam Parker2019-07-111-252/+316
| | | | | | | | | | | | | | | | Two functional changes have been made here: - Now search up from any add instruction to find the chains of operations that we may turn into a smlad. This allows the generation of a smlad which doesn't accumulate into a phi. - The search function has been corrected to stop it falsely searching up through an invalid path. The bulk of the changes have been making the Reduction struct a class and making it more C++y with getters and setters. Differential Revision: https://reviews.llvm.org/D61780 llvm-svn: 365740
* [WebAssembly] Print error message for llvm.clear_cache intrinsicHeejin Ahn2019-07-111-0/+4
| | | | | | | | | | | | | | | | Summary: Wasm does not currently support `llvm.clear_cache` intrinsic, and this prints a proper error message instead of segfault. Reviewers: dschuff, sbc100, sunfish Subscribers: jgravelle-google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64322 llvm-svn: 365731
* [SCEV] teach SCEV symbolical execution about overflow intrinsics folding.Chen Zheng2019-07-112-1/+4
| | | | | | Differential Revision: https://reviews.llvm.org/D64422 llvm-svn: 365726
* Replace three "strip & accumulate" implementations with a single oneJohannes Doerfert2019-07-113-97/+32
| | | | | | | | | | | This patch replaces the three almost identical "strip & accumulate" implementations for constant pointer offsets with a single one, combining the respective functionalities. The old interfaces are kept for now. Differential Revision: https://reviews.llvm.org/D64468 llvm-svn: 365723
* [X86] Don't convert 8 or 16 bit ADDs to LEAs on Atom in FixupLEAPass.Craig Topper2019-07-111-12/+15
| | | | | | | | | | | | | | | | | We use the functions that convert to three address to do the conversion, but changing an 8 or 16 bit will cause it to create a virtual register. This can't be done after register allocation where this pass runs. I've switched the pass completely to a white list of instructions that can be converted to LEA instead of a blacklist that was incorrect. This will avoid surprises if we enhance the three address conversion function to include additional instructions in the future. Fixes PR42565. llvm-svn: 365720
* [AMDGPU] gfx908 atomic fadd and atomic pk_faddStanislav Mekhanoshin2019-07-117-4/+195
| | | | | | Differential Revision: https://reviews.llvm.org/D64435 llvm-svn: 365717
* [AMDGPU] gfx908 dot instruction supportStanislav Mekhanoshin2019-07-111-0/+30
| | | | | | Differential Revision: https://reviews.llvm.org/D64431 llvm-svn: 365715
* [SDAG] commute setcc operands to match a subtractSanjay Patel2019-07-101-0/+11
| | | | | | | | | | | | | | | | | | | If we have: R = sub X, Y P = cmp Y, X ...then flipping the operands in the compare instruction can allow using a subtract that sets compare flags. Motivated by diffs in D58875 - not sure if this changes anything there, but this seems like a good thing independent of that. There's a more involved version of this transform already in IR (in instcombine although that seems misplaced to me) - see "swapMayExposeCSEOpportunities()". Differential Revision: https://reviews.llvm.org/D63958 llvm-svn: 365711
* NFC: Pass DataLayout into isBytewiseValueVitaly Buka2019-07-103-12/+14
| | | | | | | | | | | | | | | | | | Summary: We will need to handle IntToPtr which I will submit in a separate patch as it's not going to be NFC. Reviewers: eugenis, pcc Reviewed By: eugenis Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D63940 llvm-svn: 365709
* [X86] Add patterns with and_flag_nocf for BLSI and TBM instructions.Craig Topper2019-07-101-6/+19
| | | | | | Fixes similar issues to r352306. llvm-svn: 365705
* [X86] Add BLSR and BLSMSK to isUseDefConvertible.Craig Topper2019-07-101-1/+6
| | | | | | | | | | | | Unfortunately subo formation in CGP prevents obvious ways of testing this. But we already have BLSI in here and the flag behavior is well understood. Might become more useful if we improve PR42571. llvm-svn: 365702
* [NFC]Fix IR/MC depency issue for function descriptor SDAG implementationDavid Tenty2019-07-101-44/+35
| | | | | | | | | | | | | | | | | | Summary: llvm/IR/GlobalValue.h can't be included in MC, that creates a circular dependency between MC and IR libraries. This circular dependency is causing an issue for build system that enforce layering. Author: Xiangling_L Reviewers: sfertile, jasonliu, hubert.reinterpretcast, gribozavr Reviewed By: gribozavr Subscribers: wuzish, nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64445 llvm-svn: 365701
* [X86] Remove unused variable. NFCCraig Topper2019-07-101-1/+0
| | | | llvm-svn: 365697
* [AArch64][GlobalISel] Optimize compare and branch cases with G_INTTOPTR and ↵Amara Emerson2019-07-102-4/+18
| | | | | | | | | | | | | | | | | | | | unknown values. Since we have distinct types for pointers and scalars, G_INTTOPTRs can sometimes obstruct attempts to find constant source values. These usually come about when try to do some kind of null pointer check. Teaching getConstantVRegValWithLookThrough about this operation allows the CBZ/CBNZ optimization to catch more cases. This change also improves the case where we can't find a constant source at all. Previously we would emit a cmp, cset and tbnz for that. Now we try to just emit a cmp and conditional branch, saving an instruction. The cumulative code size improvement of this change plus D64354 is 5.5% geomean on arm64 CTMark -O0. Differential Revision: https://reviews.llvm.org/D64377 llvm-svn: 365690
* Revert "[ELF] Loose a condition for relocation with a symbol"Nikola Prica2019-07-101-0/+5
| | | | | | | | This reverts commit 8507eca1647118e73435b0ce1de8a1952a021d01. Reveting due to some suspicious failurse in santizer-x86_64-linux. llvm-svn: 365685
* [GlobalISel][AArch64] Use getOpcodeDef instead of findMIFromRegJessica Paquette2019-07-101-14/+3
| | | | | | | | | | | | | | | | Some minor cleanup. This function in Utils does the same thing as `findMIFromReg`. It also looks through copies, which `findMIFromReg` didn't. Delete `findMIFromReg` and use `getOpcodeDef` instead. This only happens in `tryOptVectorDup` right now. Update opt-shuffle-splat to show that we can look through the copies now, too. Differential Revision: https://reviews.llvm.org/D64520 llvm-svn: 365684
* [GlobalISel][AArch64][NFC] Use getDefIgnoringCopies from Utils where we canJessica Paquette2019-07-101-22/+5
| | | | | | | | | | | | | | | | | There are a few places where we walk over copies throughout AArch64InstructionSelector.cpp. In Utils, there's a function that does exactly this which we can use instead. Note that the utility function works with the case where we run into a COPY from a physical register. We've run into bugs with this a couple times, so using it should defend us from similar future bugs. Also update opt-fold-compare.mir to show that we still handle physical registers properly. Differential Revision: https://reviews.llvm.org/D64513 llvm-svn: 365683
* Revert "[System Model] [TTI] Update cache and prefetch TTI interfaces"David Greene2019-07-108-38/+37
| | | | | | | | This broke some PPC prefetching tests. This reverts commit 9fdfb045ae8bb643ab0d0455dcf9ecaea3b1eb3c. llvm-svn: 365680
* Move three folds for FADD, FSUB and FMUL in the DAG combiner away from ↵Michael Berg2019-07-101-4/+4
| | | | | | | | | | | | | | | | Unsafe to more aligned checks that reflect context Summary: Unsafe does not map well alone for each of these three cases as it is missing NoNan context when accessed directly with clang. I have migrated the fold guards to reflect the expectations of handing nan and zero contexts directly (NoNan, NSZ) and some tests with it. Unsafe does include NSZ, however there is already precedent for using the target option directly to reflect that context. Reviewers: spatel, wristow, hfinkel, craig.topper, arsenm Reviewed By: arsenm Subscribers: michele.scandale, wdng, javed.absar Differential Revision: https://reviews.llvm.org/D64450 llvm-svn: 365679
* [System Model] [TTI] Update cache and prefetch TTI interfacesDavid Greene2019-07-108-37/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rework the TTI cache and software prefetching APIs to prepare for the introduction of a general system model. Changes include: - Marking existing interfaces const and/or override as appropriate - Adding comments - Adding BasicTTIImpl interfaces that delegate to a subtarget implementation - Adding a default "no information" subtarget implementation Only a handful of targets use these interfaces currently: AArch64, Hexagon, PPC and SystemZ. AArch64 already has a custom subtarget implementation, so its custom TTI implementation is migrated to use the new facilities in BasicTTIImpl to invoke its custom subtarget implementation. The custom TTI implementations continue to exist for the other targets with this change. They are not moved over to subtarget-based implementations. The end goal is to have the default subtarget implementation defer to the system model defined by the target. With this change, the default subtarget implementation essentially returns "no information" for these interfaces. None of the existing users of TTI will hit that implementation because they define their own custom TTI implementations and won't use the BasicTTIImpl implementations. Once system models are in place for the targets that use these interfaces, their custom TTI implementations can be removed. Differential Revision: https://reviews.llvm.org/D63614 llvm-svn: 365676
* Recommit "[CommandLine] Remove OptionCategory and SubCommand caches from the ↵Don Hinton2019-07-101-59/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | Option class." Previously reverted in 364141 due to buildbot breakage, and fixed here by making GeneralCategory global a ManagedStatic. Summary: This change processes `OptionCategory`s and `SubCommand`s as they are seen instead of caching them in the Option class and processing them later. Doing so simplifies the work needed to be done by the Global parser and significantly reduces the size of the Option class to a mere 64 bytes. Removing the `OptionCategory` cache saved 24 bytes, and removing the `SubCommand` cache saved an additional 48 bytes, for a total of a 72 byte reduction. Reviewed By: serge-sans-paille Tags: #llvm, #clang Differential Revision: https://reviews.llvm.org/D62105 llvm-svn: 365675
* [X86] EltsFromConsecutiveLoads - clean up element size calcs. NFCI.Simon Pilgrim2019-07-101-14/+12
| | | | | | Determine the element/load size calculations earlier and assert that they are whole bytes in size. llvm-svn: 365674
* [LoopRotate + MemorySSA] Keep an <instruction-cloned instruction> map.Alina Sbirlea2019-07-101-4/+8
| | | | | | | | | | | | | | | | | | | | Summary: The map kept in loop rotate is used for instruction remapping, in order to simplify the clones of instructions. Thus, if an instruction can be simplified, its simplified value is placed in the map, even when the clone is added to the IR. MemorySSA in contrast needs to know about that clone, so it can add an access for it. To resolve this: keep a different map for MemorySSA. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63680 llvm-svn: 365672
* [ORC] Add custom IR compiler configuration to LLJITBuilder to enable obj caches.Lang Hames2019-07-101-44/+35
| | | | | | | | | | LLJITBuilder now has a setCompileFunctionCreator method which can be used to construct a CompileFunction for the LLJIT instance being created. The motivating use-case for this is supporting ObjectCaches, which can now be set up at compile-function construction time. To demonstrate this an example project, LLJITWithObjectCache, is included. llvm-svn: 365671
* [TargetLowering] support BlockAddress as "i" inline asm constraintNick Desaulniers2019-07-101-0/+7
| | | | | | | | | | | | | | | | | | | | Summary: This allows passing address of labels to inline assembly "i" input constraints. Fixes pr/42502. Reviewers: ostannard Reviewed By: ostannard Subscribers: void, echristo, nathanchance, ostannard, javed.absar, hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D64167 llvm-svn: 365664
* MC: AArch64: Add support for pg_hi21_nc relocation specifier.Peter Collingbourne2019-07-101-0/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D64455 llvm-svn: 365661
* [CodeExtractor] Fix sinking of allocas with multiple bitcast uses (PR42451)Vedant Kumar2019-07-101-13/+29
| | | | | | | | | | | | | | An alloca which can be sunk into the extraction region may have more than one bitcast use. Move these uses along with the alloca to prevent use-before-def. Testing: check-llvm, stage2 build of clang Fixes llvm.org/PR42451. Differential Revision: https://reviews.llvm.org/D64463 llvm-svn: 365660
* [CodeExtractor] Simplify findAllocas, NFCVedant Kumar2019-07-101-73/+91
| | | | | | | | | Split getLifetimeMarkers out into its own method and have it return a struct. Differential Revision: https://reviews.llvm.org/D64467 llvm-svn: 365659
* GlobalISel: Legalization for G_FMINNUM/G_FMAXNUMMatt Arsenault2019-07-104-1/+128
| | | | llvm-svn: 365658
* GlobalISel: Define the full family of FP min/max instructionsMatt Arsenault2019-07-101-0/+8
| | | | llvm-svn: 365657
* [X86] EltsFromConsecutiveLoads - remove duplicate check for element size. NFCI.Simon Pilgrim2019-07-101-6/+0
| | | | | | We've already checked that each element is the correct contributory size for VT when we inspect the elements for Undef/Zero/Load. llvm-svn: 365656
* [X86] EltsFromConsecutiveLoads - ensure element reg/store sizes are the same ↵Simon Pilgrim2019-07-101-3/+5
| | | | | | | | size. NFCI. This renames the type so it doesn't sound like its based off the load size - as we're moving towards supporting combining loads of different sizes. llvm-svn: 365655
* AMDGPU: Serialize mode from MachineFunctionInfoMatt Arsenault2019-07-103-1/+32
| | | | llvm-svn: 365653
* [PatternMatch] Generalize m_SpecificInt_ULT() to take ICmpInst::PredicateRoman Lebedev2019-07-102-2/+4
| | | | | | | As discussed in the original review, this may be useful, so let's just do it. llvm-svn: 365652
* [Remarks] Add cl::Hidden to -remarks-yaml-string-tableFrancis Visoiu Mistrih2019-07-101-2/+3
| | | | | | It was showing up in a lot of unrelated tools. llvm-svn: 365647
* [AMDGPU] Allow abs/neg source modifiers on v_cndmask_b32Jay Foad2019-07-101-7/+8
| | | | | | | | | | | | | | | | | Summary: D59191 added support for these modifiers in the assembler and disassembler. This patch just teaches instruction selection that it can use them. Reviewers: arsenm, tstellar Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64497 llvm-svn: 365640
* [InstCombine] pow(C,x) -> exp2(log2(C)*x)David Bolvansky2019-07-101-1/+24
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Transform pow(C,x) To exp2(log2(C)*x) if C > 0, C != inf, C != NaN (and C is not power of 2, since we have some fold for such case already). log(C) is folded by the compiler and exp2 is much faster to compute than pow. Reviewers: spatel, efriedma, evandro Reviewed By: evandro Subscribers: lebedev.ri, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64099 llvm-svn: 365637
* [X86] EltsFromConsecutiveLoads - cleanup Zero/Undef/Load element collection. ↵Simon Pilgrim2019-07-101-12/+17
| | | | | | NFCI. llvm-svn: 365628
* [MIPS GlobalISel] Select float and double phiPetar Avramovic2019-07-101-4/+25
| | | | | | | | Select float and double phi for MIPS32. Differential Revision: https://reviews.llvm.org/D64420 llvm-svn: 365627
OpenPOWER on IntegriCloud