summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* Cloning: Copy comdats when cloning globals.Peter Collingbourne2017-01-181-0/+13
| | | | | | Differential Revision: https://reviews.llvm.org/D28838 llvm-svn: 292430
* Fix up a comment. NFC.Michael Kuperstein2017-01-181-1/+0
| | | | llvm-svn: 292425
* [LV] Allow reductions that have several uses outside the loopMichael Kuperstein2017-01-182-10/+14
| | | | | | | | | | | We currently check whether a reduction has a single outside user. We don't really need to require that - we just need to make sure a single value is used externally. The number of external users of that value shouldn't actually matter. Differential Revision: https://reviews.llvm.org/D28830 llvm-svn: 292424
* [AArch64] Generate literals by the little endEvandro Menezes2017-01-183-26/+26
| | | | | | | | | | | ARM seems to prefer that long literals be formed from their little end in order to promote the fusion of the instrs pairs MOV/MOVK and MOVK/MOVK on Cortex A57 and others (v. "Cortex A57 Software Optimisation Guide", section 4.14). Differential revision: https://reviews.llvm.org/D28697 llvm-svn: 292422
* [NewGVN] We don't use postdom info anymore. Update.Davide Italiano2017-01-181-1/+0
| | | | | | Differential Revision: https://reviews.llvm.org/D28842 llvm-svn: 292421
* [ThinLTO] Add a recursive step in Metadata lazy-loadingMehdi Amini2017-01-181-4/+17
| | | | | | | | | | | | | | | | | | | | | | Summary: Without this, we're stressing the RAUW of unique nodes, which is a costly operation. This is intended to limit the number of RAUW, and is very effective on the total link-time of opt with ThinLTO, before: real 4m4.587s user 15m3.401s sys 0m23.616s after: real 3m25.261s user 12m22.132s sys 0m24.152s Reviewers: tejohnson, pcc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28751 llvm-svn: 292420
* [AMDGPU] Do not allow register coalescer to create big superregsStanislav Mekhanoshin2017-01-182-0/+27
| | | | | | | | | | | | | | | | Limit register coalescer by not allowing it to artificially increase size of registers beyond dword. Such super-registers are in fact register sequences and not distinct HW registers. With more super-regs we would need to allocate adjacent registers and constraint regalloc more than needed. Moreover, our super registers are overlapping. For instance we have VGPR0_VGPR1_VGPR2, VGPR1_VGPR2_VGPR3, VGPR2_VGPR3_VGPR4 etc, which complicates registers allocation even more, resulting in excessive spilling. Differential Revision: https://reviews.llvm.org/D28782 llvm-svn: 292413
* GlobalISel: Implement narrowing for G_STOREJustin Bogner2017-01-181-2/+23
| | | | | | | Legalize stores of types that are too wide by breaking them up into sequences of smaller stores. llvm-svn: 292412
* Don't create a comdat group for a dropped def with initializerTeresa Johnson2017-01-181-2/+5
| | | | | | | | | | | | | | | | | | Non-prevailing weak/linkonce odr symbols will be dropped by ThinLTO to available_externally when possible. If they had an initializer in the global_ctors list, a comdat group was being created. This code already had logic to skip available_externally defs, but now the EliminateAvailableExternally pass will drop these symbols to declarations earlier. Change the check to skip all declarations for linker (which includes available_externally along with declarations). Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28737 llvm-svn: 292408
* Revert 292404 due to buildbot failures.Kirill Bobyrev2017-01-182-5/+5
| | | | llvm-svn: 292407
* [X86] Minor code cleanup to fix several clang-tidy warnings. NFCKirill Bobyrev2017-01-182-5/+5
| | | | llvm-svn: 292404
* [ARM] Create SubtargetFeatures from build attrsSam Parker2017-01-181-42/+153
| | | | | | | | | | | An ELFObjectFile can now create SubtargetFeatures from the available ARM build attributes, in a similar manner to MIPS. I've moved the MIPS code into its own function and the ARM handler also has a separate function. Differential Revision: https://reviews.llvm.org/D28291 llvm-svn: 292403
* raw_fd_ostream: Make file handles non-inheritable by defaultPavel Labath2017-01-181-2/+2
| | | | | | | | | | | | | | | | Summary: This makes the file descriptors on unix platform non-inheritable (O_CLOEXEC). There is no change in behavior on windows, as the handles were already non-inheritable there. Reviewers: rnk, rafael Subscribers: llvm-commits, mgorny Differential Revision: https://reviews.llvm.org/D28854 llvm-svn: 292401
* [Assembler] Fix crash when assembling .quad for AArch32.Chad Rosier2017-01-181-1/+2
| | | | | | | | | | | A 64-bit relocation does not exist in 32-bit ARMELF. Report an error instead of crashing. PR23870 Patch by Sanne Wouda (sanwou01). Differential Revision: https://reviews.llvm.org/D28851 llvm-svn: 292373
* [thumb,framelowering] Reset NoVRegs in Thumb1FrameLowering::emitPrologue.Florian Hahn2017-01-182-0/+6
| | | | | | | | | | | | | | | | | | | | | | | Summary: In this function, virtual registers can be introduced (for example through calls to emitThumbRegPlusImmInReg). doScavengeFrameVirtualRegs will replace those virtual registers with concrete registers later on in PrologEpilogInserter, which sets NoVRegs again. This patch fixes the Codegen/Thumb/segmented-stacks.ll test case which failed with expensive checks. https://llvm.org/bugs/show_bug.cgi?id=27484 Reviewers: rnk, bkramer, olista01 Reviewed By: olista01 Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D28829 llvm-svn: 292372
* [InstCombine][AVX2] Add DemandedElts support for VPERMD/VPERMPS shufflesSimon Pilgrim2017-01-181-1/+4
| | | | | | Simplify a vpermv shuffle mask based on the elements of the mask that are actually demanded. llvm-svn: 292371
* Re-revert: [globalisel] Tablegen-erate current Register Bank InformationDaniel Sanders2017-01-188-86/+191
| | | | | | | More missing guards. My build didn't notice it due to a stale file left over from a Global ISel build. llvm-svn: 292369
* Re-commit: [globalisel] Tablegen-erate current Register Bank InformationDaniel Sanders2017-01-188-191/+86
| | | | | | | | | | | | | | | | | | | | | | | Summary: Adds a RegisterBank tablegen class that can be used to declare the register banks and an associated tablegen pass to generate the necessary code. Changes since last commit: The new tablegen pass is now correctly guarded by LLVM_BUILD_GLOBAL_ISEL and this should fix the buildbots however it may not be the whole fix. The previous buildbot failures suggest there may be a memory bug lurking that I'm unable to reproduce (including when using asan) or spot in the source. If they re-occur on this commit then I'll need assistance from the bot owners to track it down. Reviewers: t.p.northover, ab, rovka, qcolombet Reviewed By: qcolombet Subscribers: aditya_nandakumar, rengolin, kristof.beyls, vkalintiris, mgorny, dberris, llvm-commits, rovka Differential Revision: https://reviews.llvm.org/D27338 llvm-svn: 292367
* [ARM] Create objdump subtarget from build attrsSam Parker2017-01-183-51/+144
| | | | | | | | | | | Enable an ELFObjectFile to read the its arm build attributes to produce a target triple with a specific ARM architecture. llvm-objdump now uses this functionality to automatically produce a more accurate target. Differential Revision: https://reviews.llvm.org/D28769 llvm-svn: 292366
* [InstCombine] Remove unnecessary intrinsics demanded elts handlingSimon Pilgrim2017-01-181-22/+2
| | | | | | As discussed on D28777 - we don't need to handle 'all element' shuffles inside InstCombiner::visitCallInst as InstCombiner::SimplifyDemandedVectorElts will do everything we need. llvm-svn: 292365
* [X86] Improve mul combine for negative multiplayer (2^c - 1)Michael Zuckerman2017-01-181-16/+31
| | | | | | | | | | | This patch improves the mul instruction combine function (combineMul) by adding new layer of logic. In this patch, we are adding the ability to fold (mul x, -((1 << c) -1)) or (mul x, -((1 << c) +1)) into (neg(X << c) -x) or (neg((x << c) + x) respective. Differential Revision: https://reviews.llvm.org/D28232 llvm-svn: 292358
* Revert "[XRay][Arm] Repair XRay table emission on Arm32 and add tests to ↵Renato Golin2017-01-181-3/+0
| | | | | | | | | | | identify such problem earlier" This reverts commit r292210, as it broke the Thumb buldbot with: clang-5.0: error: the clang compiler does not support '-fxray-instrument on thumbv7-unknown-linux-gnueabihf'. llvm-svn: 292357
* [SystemZ] Proper handling of undef flag while expanding pseudo.Jonas Paulsson2017-01-182-7/+11
| | | | | | | | During post-RA pseudo expansion, an 'undef' flag of the source operand should be propagated by emitGRX32Move(). Review: Ulrich Weigand llvm-svn: 292353
* [X86] Fix for bugzilla 31576 - add support for "data32" instruction prefixMarina Yatsina2017-01-183-2/+18
| | | | | | | | | | | This patch fixes bugzilla 31576 (https://llvm.org/bugs/show_bug.cgi?id=31576). "data32" instruction prefix was not defined in the llvm. An exception had to be added to the X86 tablegen and AsmPrinter because both "data16" and "data32" are encoded to 0x66 (but in different modes). Differential Revision: https://reviews.llvm.org/D28468 llvm-svn: 292352
* [LoopDeletion] (cleanup, NFC) Fix one more local variable that didn'tChandler Carruth2017-01-181-2/+2
| | | | | | | | | follow LLVM's naming conventions while I'm here. Again, sorry I didn't spot this earlier to coalesce with other cleanup changes. llvm-svn: 292333
* [PM] Teach LoopDeletion to correctly update the LPM when loops areChandler Carruth2017-01-181-10/+20
| | | | | | | | | deleted. I've expanded its test coverage a bit including adding one test that will crash clearly without this change. llvm-svn: 292332
* DAG: Consider nnan in isKnownNeverNaNMatt Arsenault2017-01-181-0/+3
| | | | llvm-svn: 292328
* Revert rL292292 since it causes a SEGV on sanitizer-x86_64-linux-fuzzer ↵Wei Mi2017-01-181-171/+0
| | | | | | build bot. llvm-svn: 292327
* [libFuzzer] remove stale codeKostya Serebryany2017-01-184-131/+4
| | | | llvm-svn: 292325
* [WebAssembly] Update grow_memory's return type.Dan Gohman2017-01-181-2/+2
| | | | | | | The grow_memory instruction now returns the previous memory size. Add the return type to the LLVM intrinsic. llvm-svn: 292322
* MIRParser: Allow regclass specification on operandMatthias Braun2017-01-183-8/+83
| | | | | | | | | | | You can now define the register class of a virtual register on the operand itself avoiding the need to use a "registers:" block. Example: "%0:gr64 = COPY %rax" Differential Revision: https://reviews.llvm.org/D22398 llvm-svn: 292321
* [Target, Transforms] Fix some Clang-tidy modernize and Include What You Use ↵Eugene Zelenko2017-01-183-40/+105
| | | | | | warnings; other minor fixes (NFC). llvm-svn: 292320
* [libFuzzer] exit(1) on failed mergeKostya Serebryany2017-01-182-0/+10
| | | | llvm-svn: 292319
* [NVPTX] Support global variables of integer type larger than i64.Justin Lebar2017-01-181-1/+14
| | | | | | | | | | Reviewers: tra, majnemer Subscribers: llvm-commits, jholewinski Differential Revision: https://reviews.llvm.org/D28825 llvm-svn: 292316
* Skip loop header while we can when computing loop safety infoXin Tong2017-01-181-1/+4
| | | | llvm-svn: 292310
* [NVPTX] Standardize asm printer on "foo \tbar".Justin Lebar2017-01-182-218/+218
| | | | | | | Some instructions were printed as "foo\tbar", but most are printed as "foo \bar". Standardize on the latter form. llvm-svn: 292306
* [NVPTX] Clean up nested !strconcat calls.Justin Lebar2017-01-182-99/+68
| | | | | | | !strconcat is a variadic function; it will concatenate an arbitrary number of strings. There's no need to nest it. llvm-svn: 292305
* [NVPTX] Implement min/max in tablegen, rather than with custom DAGComine logic.Justin Lebar2017-01-182-70/+16
| | | | | | | | | | | | | | | | | | | | | | | Summary: This change also lets us use max.{s,u}16. There's a vague warning in a test about this maybe being less efficient, but I could not come up with a case where the resulting SASS (sm_35 or sm_60) was different with or without max.{s,u}16. It's true that nvcc seems to emit only max.{s,u}32, but even ptxas 7.0 seems to have no problem generating efficient SASS from max.{s,u}16 (the casts up to i32 and back down to i16 seem to be implicit and nops, happening via register aliasing). In the absence of evidence, better to have fewer special cases, emit more straightforward code, etc. In particular, if a new GPU has 16-bit min/max instructions, we want to be able to use them. Reviewers: tra Subscribers: jholewinski, llvm-commits Differential Revision: https://reviews.llvm.org/D28732 llvm-svn: 292304
* [NVPTX] Lower integer absolute value idiom to abs instruction.Justin Lebar2017-01-181-0/+12
| | | | | | | | | | | | Summary: Previously we lowered it literally, to shifts and xors. Reviewers: tra Subscribers: jholewinski, llvm-commits Differential Revision: https://reviews.llvm.org/D28722 llvm-svn: 292303
* [NVPTX] Improve lowering of llvm.ctpop.Justin Lebar2017-01-181-5/+9
| | | | | | | | | | | | | | | | | | Summary: Avoid an unnecessary conversion operation when using the result of ctpop.i32 or ctpop.i16 as an i32, as in both cases the ptx instruction we run returns an i32. (Previously if we used the value as an i32, we'd do an unnecessary zext+trunc.) Reviewers: tra Subscribers: jholewinski, llvm-commits Differential Revision: https://reviews.llvm.org/D28721 llvm-svn: 292302
* [NVPTX] Add lowering for llvm.bitreverse.Justin Lebar2017-01-182-0/+13
| | | | | | | | | | Reviewers: tra Subscribers: llvm-commits, jholewinski Differential Revision: https://reviews.llvm.org/D28720 llvm-svn: 292301
* [NVPTX] Improve lowering of llvm.ctlz.Justin Lebar2017-01-182-9/+29
| | | | | | | | | | | | | | | | | Summary: * Disable "ctlz speculation", which inserts a branch on every ctlz(x) which has defined behavior on x == 0 to check whether x is, in fact zero. * Add DAG patterns that avoid re-truncating or re-expanding the result of the 16- and 64-bit ctz instructions. Reviewers: tra Subscribers: llvm-commits, jholewinski Differential Revision: https://reviews.llvm.org/D28719 llvm-svn: 292299
* [libFuzzer] add ATTRIBUTE_NO_SANITIZE_MEMORY to sanitizer hooksKostya Serebryany2017-01-171-0/+14
| | | | llvm-svn: 292295
* Introduce -unroll-partial-threshold to separate PartialThreshold from ↵Dehao Chen2017-01-171-5/+9
| | | | | | | | | | | | | | | | Threshold in loop unorller. Summary: Partial unrolling should have separate threshold with full unrolling. Reviewers: efriedma, mzolotukhin Reviewed By: efriedma, mzolotukhin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28831 llvm-svn: 292293
* [RegisterCoalescing] Remove partial redundent copy.Wei Mi2017-01-171-0/+171
| | | | | | | | | | | | The patch is to solve the performance problem described in PR27827. Register coalescing sometimes cannot remove a copy because of interference. But if we can find a reverse copy in one of the predecessor block of the copy, the copy is partially redundent and we may remove the copy partially by moving it to the predecessor block without the reverse copy. Differential Revision: https://reviews.llvm.org/D28585 llvm-svn: 292292
* [libfuzzer] fixing collected pc addresses for coverageMike Aizatsky2017-01-175-21/+33
| | | | | | | | | | | | Summary: The causes google/ossfuzz#84 Reviewers: kcc Subscribers: mgorny Differential Revision: https://reviews.llvm.org/D28827 llvm-svn: 292289
* [libFuzzer] use table of recent compares for memcmp/strcmp (to unify the ↵Kostya Serebryany2017-01-1710-68/+94
| | | | | | code between cmp and memcmp handling) llvm-svn: 292287
* [libFuzzer] copy the options inside MutationDispatcher to avoid ↵Kostya Serebryany2017-01-171-1/+2
| | | | | | use-after-scope in mutator tests llvm-svn: 292286
* GlobalISel: fix comparison order for G_FCMPTim Northover2017-01-171-2/+2
| | | | | | As with G_ICMP we'd written the CSET instructions backwards. llvm-svn: 292285
* GlobalISel: add callseq instructions to record stack usageTim Northover2017-01-171-1/+10
| | | | llvm-svn: 292284
OpenPOWER on IntegriCloud