path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* Allow suppressing host and target info in VersionPrinterXin Tong2017-04-191-1/+4
| Summary:
| VersionPrinter by default outputs information about the Host CPU and Default target. Printing this information requires linking in a large amount of data, such as supported target triples as C strings, which in turn bloats the binary size.
|
| Enable a new CMake option LLVM_VERSION_PRINTER_SHOW_HOST_TARGET_INFO which controls printing of the host and target info. This allows the target triple names to be dead-code stripped. This is a nice win for LLVM clients that wish to minimize their binary size, such as graphics drivers.
|
| By default this is ON, so there is no change in the default behavior. Clients who wish to suppress this printing can do so by setting this option to off via CMake.
|
| A test app on Linux that uses ParseCommandLineOptions() shows a binary size reduction of 23KB (from 149K to 126K) for a Release build, and 24KB (from 135K to 111K) in a MinSizeRel build.
|
| Reviewers: klimek, beanz, bogner, chandlerc, compnerd
| Reviewed By: compnerd
| Patch by pammon (Peter Ammon)!
| Subscribers: llvm-commits
| Differential Revision: https://reviews.llvm.org/D30904
| llvm-svn: 300630
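A minimal sketch of how a build-time switch like this lets the linker drop the host/target data. The macro name SKETCH_SHOW_HOST_TARGET_INFO and both functions are hypothetical stand-ins; the commit only names the CMake option, not how it reaches the C++ code.

```cpp
#include <string>

// Assumption: the build system defines this macro to 0 or 1, mirroring the
// CMake option. The macro name is hypothetical, not LLVM's.
#ifndef SKETCH_SHOW_HOST_TARGET_INFO
#define SKETCH_SHOW_HOST_TARGET_INFO 1
#endif

// Stand-in for the code that pulls in target triple strings and CPU names.
std::string hostTargetInfo() {
  return "  Default target: example-triple\n  Host CPU: example-cpu\n";
}

// When the macro is 0, nothing references hostTargetInfo(), so the linker is
// free to strip it together with the data it uses.
std::string versionText() {
  std::string Out = "Example tool version 1.0\n";
#if SKETCH_SHOW_HOST_TARGET_INFO
  Out += hostTargetInfo();
#endif
  return Out;
}
```

Compiling with -DSKETCH_SHOW_HOST_TARGET_INFO=0 leaves only the version line in the output.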
* [AVR] Fix the buildDylan McKay2017-04-181-1/+1
| | | | | | 'PointerSize' was renamed to 'CodePointerSize'. llvm-svn: 300629
* [ConstantRange] Optimize APInt creation in getSignedMax/getSignedMin.Craig Topper2017-04-181-8/+8
| We were creating an APInt at the top of these methods that isn't always returned. For ranges wider than 64 bits this results in an allocation and deallocation when it's not used.
|
| In getSignedMax we were creating Upper-1 to use in a compare and then creating it again for a return value. The compiler is unable to determine that these can be shared. So help it out and create the Upper-1 in a temporary that can be reused.
|
| This provides a little compile time improvement.
| llvm-svn: 300621
* Fix crash in AttributeList::addAttributes, add testReid Kleckner2017-04-181-0/+3
| | | | llvm-svn: 300614
* Add a getPointerOperandType() helper to LoadInst and StoreInst; NFCSanjoy Das2017-04-184-10/+7
| | | | | | I will use this in a later change. llvm-svn: 300613
* [MemoryBuiltins] Add isMallocOrCallocLikeFn so BasicAA can check for both at the same timeCraig Topper2017-04-183-3/+11
| BasicAA wants to know if a function is either a malloc-like or a calloc-like function. Currently we have to check both separately. This means both calls check if it's an intrinsic, query TLI, check the nobuiltin attribute, scan the AllocationFnData, etc.
|
| This patch adds an isMallocOrCallocLikeFn so we can go through all of the checks once per call.
|
| This also changes the one other location I saw that called both together.
|
| Differential Revision: https://reviews.llvm.org/D32188
| llvm-svn: 300608
* [LoopReroll] Prefer hasNUses/hasNUsesOrMore as they're cheaper. NFCI.Davide Italiano2017-04-181-2/+2
| | | | llvm-svn: 300607
* DAG: Make mayBeEmittedAsTailCall parameter constMatt Arsenault2017-04-1810-11/+11
| | | | llvm-svn: 300603
* Fix typoMatt Arsenault2017-04-181-1/+1
| | | | llvm-svn: 300597
* AMDGPU: Make MFI fields privateMatt Arsenault2017-04-182-6/+8
| | | | llvm-svn: 300596
* [MemoryBuiltins] Use ImmutableCallSite instead of CallSite to remove a ↵Craig Topper2017-04-181-4/+4
| | | | | | const_cast and const correct. NFCI llvm-svn: 300585
* NewGVN: Fix memory congruence verification. The return true should be a ↵Daniel Berlin2017-04-181-8/+8
| | | | | | return false. Merge the appropriate if statements so it doesn't happen again. llvm-svn: 300584
* [X86] Keep EXTRACT_VECTOR_ELT result type as f128 for Android x86_64.Chih-Hung Hsieh2017-04-182-3/+6
| | | | | | | | | | Android x86_64 target uses f128 type and stores f128 values in %xmm* registers. SoftenFloatRes_EXTRACT_VECTOR_ELT should not convert result value from f128 to i128. Differential Revision: http://reviews.llvm.org/D32102 llvm-svn: 300583
* [APInt] Inline the single word case of lshrInPlace similar to what we do for ↵Craig Topper2017-04-181-9/+1
| | | | | | <<=. llvm-svn: 300577
* [SLP vectorizer] Allow phi node reordering in tryToVectorizeList.Easwaran Raman2017-04-181-3/+9
| In tryToVectorizeList, under a very limited circumstance (when entered from tryToVectorizePair), the values may be reordered (swapped) and the SLP tree is built with the new order. This extends that to the case when starting from phis in vectorizeChainsInBlock when there are exactly two phis. The textual order of phi nodes shouldn't really matter. Without this change, the loop body in the accompanying test case is fully vectorized when we swap the order of the phis but not with this order. While this doesn't solve the phi-ordering problem in a general way (for more than 2 phis), this is a simple fix that piggybacks on an existing mechanism and is useful in cases like multiplying two complex numbers.
|
| Differential revision: https://reviews.llvm.org/D32065
| llvm-svn: 300574
* [X86] Use for-range loop. NFCI.Simon Pilgrim2017-04-181-2/+2
| | | | llvm-svn: 300567
* [APInt] Use lshrInPlace to replace lshr where possibleCraig Topper2017-04-1817-55/+60
| | | | | | | | | | This patch uses lshrInPlace to replace code where the object that lshr is called on is being overwritten with the result. This adds an lshrInPlace(const APInt &) version as well. Differential Revision: https://reviews.llvm.org/D32155 llvm-svn: 300566
* NewGVN: Don't waste time value numbering unreachable blocksDaniel Berlin2017-04-181-17/+6
| | | | llvm-svn: 300565
* [DAG] Improve store merge candidate pruning.Nirav Dave2017-04-181-0/+21
| | | | | | | | | | | | | | Remove non-consecutive stores from store merge candidate search as they cannot be merged and will prevent us from finding subsequent mergeable store cases. Reviewers: jyknight, bogner, javed.absar, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32086 llvm-svn: 300561
* LoopRerollPass: Prefer Value::hasOneUse() over Value::getNumUses(). NFC.Zvi Rackover2017-04-181-1/+1
| getNumUses() can be more expensive as it iterates over all of the use list's elements.
| llvm-svn: 300558
* [LV] Cache block mask valuesGil Rapaport2017-04-181-7/+17
| | | | | | | | | | | | This patch is part of D28975's breakdown. Add caching for block masks similar to the cache already used for edge masks, replacing generation per user with reusing the first generated value which dominates all uses. Differential Revision: https://reviews.llvm.org/D32054 llvm-svn: 300557
* [ConstantRange] fix doxygen comment formatting; NFCSanjay Patel2017-04-181-76/+0
| | | | llvm-svn: 300554
* [GVNHoist] Mark GlobalsAA as preserved by GVNHoist.Nikolai Bozhenov2017-04-181-0/+3
| | | | | | | | | | | | | Reviewers: sebpop, hiraditya Reviewed By: sebpop Subscribers: n.bozhenov, llvm-commits Differential Revision: https://reviews.llvm.org/D32158 Patch by Andrei Elovikov <andrei.elovikov@intel.com> llvm-svn: 300552
* [ARM] Add hardware build attributes in assemblerOliver Stannard2017-04-183-164/+189
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the assembler, we should emit build attributes based on the target selected with command-line options. This matches the GNU assembler's behaviour. We only do this for build attributes which describe the hardware that is expected to be available, not the ones that describe ABI compatibility. This is done by moving some of the attribute emission code to ARMTargetStreamer, so that it can be shared between the assembly and code-generation code paths. Since the assembler only creates a MCSubtargetInfo, not an ARMSubtarget, the code had to be changed to check raw features, and not use the convenience functions in ARMSubtarget. If different attributes are later specified using the .eabi_attribute directive, then they will take precedence, as happens when the same .eabi_attribute is specified twice. This must be enabled by an option, because we don't want to do this when parsing inline assembly. The attributes would match the ones emitted at the start of the file, so wouldn't actually change the emitted object file, but the extra directives would be added to every inline assembly block when emitting assembly, which we'd like to avoid. The majority of the changes in the build-attributes.ll test are just re-ordering the directives, because the hardware attributes are now emitted before the ABI ones. However, I did fix one bug which I spotted: Tag_CPU_arch_profile was not being emitted for v6M. Differential revision: https://reviews.llvm.org/D31812 llvm-svn: 300547
* [ARM] GlobalISel: Add support for G_SUBDiana Picus2017-04-183-2/+8
| | | | | | | Support G_SUB throughout the GlobalISel pipeline. It is exactly the same as G_ADD, nothing fancy. llvm-svn: 300546
* [SampleProfile] Don't assert when printing the DebugLoc of a branch. NFC.Andrea Di Biagio2017-04-181-2/+4
| | | | llvm-svn: 300544
* [SampleProfile] Skip intrinsic calls when visiting callsites in ↵Andrea Di Biagio2017-04-181-1/+1
| | | | | | | | | | | | | | InlineHotFunctions. Before this patch, we always called method 'findCalleeFunctionSamples()' on intrinsic calls. However, intrinsic calls like llvm.dbg.value() are not viable candidates for obvious reasons. No functional change intended. Differential Revision: https://reviews.llvm.org/D32008 llvm-svn: 300541
* Revert "[GlobalISel] Support vector-of-pointers in LLT"Kristof Beyls2017-04-184-27/+17
| This reverts r300535 and r300537.
|
| The newly added tests in test/CodeGen/AArch64/GlobalISel/arm64-fallback.ll produce slightly different code depending on the compiler LLVM is built with. E.g., depending on that compiler, either one of the following can be produced:
|
| remark: <unknown>:0:0: unable to legalize instruction: %vreg0<def>(p0) = G_EXTRACT_VECTOR_ELT %vreg1, %vreg2; (in function: vector_of_pointers_extractelement)
| remark: <unknown>:0:0: unable to legalize instruction: %vreg2<def>(p0) = G_EXTRACT_VECTOR_ELT %vreg1, %vreg0; (in function: vector_of_pointers_extractelement)
|
| Non-determinism like this is clearly a bad thing, so reverting this until I can find and fix the root cause of the non-determinism.
| llvm-svn: 300538
* Fix gcc build after r300535.Kristof Beyls2017-04-181-8/+8
| | | | llvm-svn: 300537
* [ARM] Check for correct HW div when lowering divmodDiana Picus2017-04-181-1/+3
| For subtargets that use the custom lowering for divmod, e.g. gnueabi, we used to check if the subtarget has hardware divide and then lower to a div-mul-sub sequence if true, or to a libcall if false.
|
| However, judging by the usage of hasDivide vs hasDivideInARMMode, it seems that hasDivide only refers to Thumb. For instance, in the ARMTargetLowering constructor, the code that specifies whether to use libcalls for (S|U)DIV looks like this:
|
|   bool hasDivide = Subtarget->isThumb() ? Subtarget->hasDivide()
|                                         : Subtarget->hasDivideInARMMode();
|
| In the case of divmod for arm-gnueabi, using only hasDivide() to determine what to do means that instead of lowering to __aeabi_idivmod to get the remainder, we lower to div-mul-sub and then further lower the div to __aeabi_idiv. Even worse, if we have hardware divide in ARM but not in Thumb, we generate a libcall instead of using it (this is not an issue in practice since AFAICT none of the cores that we support have hardware divide in ARM but not Thumb).
|
| This patch fixes the code dealing with custom lowering to take into account the mode (Thumb or ARM) when deciding whether or not hardware division is available.
|
| Differential Revision: https://reviews.llvm.org/D32005
| llvm-svn: 300536
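The mode-aware decision described in this commit can be sketched roughly as follows. SubtargetSketch, its fields, and both functions are hypothetical stand-ins for illustration; only the names hasDivide and hasDivideInARMMode come from the commit message.

```cpp
// Hypothetical stand-in for the ARM subtarget; field names are illustrative.
struct SubtargetSketch {
  bool IsThumb;
  bool HasDivideInThumb; // what hasDivide() reports, per the commit message
  bool HasDivideInARM;   // what hasDivideInARMMode() reports
};

// Hardware divide is available only if the *current* mode supports it.
bool hasHardwareDivide(const SubtargetSketch &ST) {
  return ST.IsThumb ? ST.HasDivideInThumb : ST.HasDivideInARM;
}

enum class DivRemLowering { DivMulSub, LibCall };

// With hardware divide, compute the remainder as a - (a / b) * b;
// otherwise fall back to a single divmod library call.
DivRemLowering chooseDivRemLowering(const SubtargetSketch &ST) {
  return hasHardwareDivide(ST) ? DivRemLowering::DivMulSub
                               : DivRemLowering::LibCall;
}
```

Checking only the Thumb flag would pick DivMulSub for an ARM-mode subtarget without ARM-mode divide, which is the situation the commit describes.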
* [GlobalISel] Support vector-of-pointers in LLTKristof Beyls2017-04-184-17/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes PR32471. As comment 10 on that bug report highlights (https://bugs.llvm.org//show_bug.cgi?id=32471#c10), there are quite a few different defendable design tradeoffs that could be made, including not representing pointers at all in LLT. I decided to go for representing vector-of-pointer as a concept in LLT, while keeping the size of the LLT type 64 bits (this is an increase from 48 bits before). My rationale for keeping pointers explicit is that on some targets probably it's very handy to have the distinction between pointer and non-pointer (e.g. 68K has a different register bank for pointers IIRC). If we keep a scalar pointer, it probably is easiest to also have a vector-of-pointers to keep LLT relatively conceptually clean and orthogonal, while we don't have a very strong reason to break that orthogonality. Once we gain more experience on the use of LLT, we can of course reconsider this direction. Rejecting vector-of-pointer types in the IRTranslator is also an option to avoid the crash reported in PR32471, but that is only a very short-term solution; also needs quite a bit of code tweaks in places, and is probably fragile. Therefore I didn't consider this the best option. llvm-svn: 300535
* test commitLeslie Zhai2017-04-181-0/+1
| | | | llvm-svn: 300532
* [APInt] Cleanup the reverseBits slow case a little.Craig Topper2017-04-181-6/+4
| | | | | | Use lshrInPlace. Use single bit extract and operator|=(uint64_t) to avoid a few temporary APInts. llvm-svn: 300527
* [APInt] Make operator<<= shift in place. Improve the implementation of ↵Craig Topper2017-04-181-78/+24
| | | | | | tcShiftLeft and use it to implement operator<<=. llvm-svn: 300526
* PR32382: Fix emitting complex DWARF expressions.Adrian Prantl2017-04-1819-139/+233
| The DWARF specification knows 3 kinds of non-empty simple location descriptions:
| 1. Register location descriptions
|    - describe a variable in a register
|    - consist of only a DW_OP_reg
| 2. Memory location descriptions
|    - describe the address of a variable
| 3. Implicit location descriptions
|    - describe the value of a variable
|    - end with DW_OP_stack_value & friends
|
| The existing DwarfExpression code is pretty much ignorant of these restrictions. This used to not matter because we only emitted very short expressions that we happened to get right by accident. This patch makes DwarfExpression aware of the rules defined by the DWARF standard and now chooses the right kind of location description for each expression being emitted.
|
| This would have been an NFC commit (for the existing testsuite) if not for the way that clang describes captured block variables. Based on how the previous code in LLVM emitted locations, DW_OP_deref operations that should have come at the end of the expression are put at its beginning. Fixing this means changing the semantics of DIExpression, so this patch bumps the version number of DIExpression and implements a bitcode upgrade.
|
| There are two major changes in this patch:
|
| I had to fix the semantics of dbg.declare for describing function arguments. After this patch a dbg.declare always takes the *address* of a variable as the first argument, even if the argument is not an alloca.
|
| When lowering a DBG_VALUE, the decision of whether to emit a register location description or a memory location description depends on the MachineLocation: register machine locations may get promoted to memory locations based on their DIExpression. (Future) optimization passes that want to salvage implicit debug location for variables may do so by appending a DW_OP_stack_value.
|
| For example:
|   DBG_VALUE, [RBP-8]                        --> DW_OP_fbreg -8
|   DBG_VALUE, RAX                            --> DW_OP_reg0 +0
|   DBG_VALUE, RAX, DIExpression(DW_OP_deref) --> DW_OP_reg0 +0
|
| All testcases that were modified were regenerated from clang. I also added source-based testcases for each of these to the debuginfo-tests repository over the last week to make sure that no synchronized bugs slip in. The debuginfo-tests compile from source and run the debugger.
|
| https://bugs.llvm.org/show_bug.cgi?id=32382
| <rdar://problem/31205000>
|
| Differential Revision: https://reviews.llvm.org/D31439
| llvm-svn: 300522
* Add const to a const method. NFCGeorge Burgess IV2017-04-181-1/+1
| | | | llvm-svn: 300520
* [Target] Use hasOneUse() instead of getNumUses().Davide Italiano2017-04-181-1/+1
| The latter does a linear scan over a linked list, and is therefore much more expensive.
| llvm-svn: 300518
* Object: Shrink the size of irsymtab::Symbol by a word. NFCI.Peter Collingbourne2017-04-171-14/+18
| | | | | | | | | | Instead of storing an UncommonIndex on the Symbol, use a flag bit to store whether the Symbol has an Uncommon. This shrinks Chromium's .bc files (after D32061) by about 1%. Differential Revision: https://reviews.llvm.org/D32070 llvm-svn: 300514
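One way to read the flag-bit idea, as a sketch only: SymbolSketch, UncommonSketch, the bit position, and the lookup by counting are assumptions made for illustration and are not taken from irsymtab itself.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical symbol record: one flag bit replaces a whole index word.
struct SymbolSketch {
  enum : std::uint32_t { FB_has_uncommon = 1u << 0 }; // bit position assumed
  std::uint32_t Flags = 0;
  bool hasUncommon() const { return (Flags & FB_has_uncommon) != 0; }
};

struct UncommonSketch {
  int Payload;
};

// Assumption: Uncommons holds one entry per flagged symbol, in symbol order,
// so the position is recovered by counting flagged symbols before SymIndex.
// Returns nullptr when the symbol has no uncommon data.
const UncommonSketch *findUncommon(const std::vector<SymbolSketch> &Syms,
                                   const std::vector<UncommonSketch> &Uncommons,
                                   std::size_t SymIndex) {
  if (!Syms[SymIndex].hasUncommon())
    return nullptr;
  std::size_t Pos = 0;
  for (std::size_t I = 0; I < SymIndex; ++I)
    if (Syms[I].hasUncommon())
      ++Pos;
  return &Uncommons[Pos];
}
```

The trade-off in this sketch is a smaller per-symbol record in exchange for a scan on lookup; a sequential reader could instead keep a running counter.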
* Build SymbolMap in SampleProfileLoader to help matching function names with suffix.Dehao Chen2017-04-171-1/+31
| Summary: If a suffix is added to the function name (e.g. a module hash added by ThinLTO), we will not be able to find a match in the profile, as the suffix does not exist in the profile. This patch builds a map from function name to Function *. The map includes an entry for the stripped function name so that inlineHotFunctions can find the corresponding function to promote/inline.
|
| Reviewers: davidxl, dnovillo, tejohnson
| Reviewed By: davidxl
| Subscribers: mehdi_amini, llvm-commits
| Differential Revision: https://reviews.llvm.org/D31952
| llvm-svn: 300507
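A rough sketch of such a name map. FunctionSketch, stripSuffix, and buildSymbolMap are hypothetical, and the rule that a suffix starts at the first '.' is an assumption; the commit message does not say how the suffix is recognized.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for llvm::Function.
struct FunctionSketch {
  std::string Name;
};

// Assumption: everything from the first '.' onward is a suffix.
// A name without '.' is returned unchanged.
std::string stripSuffix(const std::string &Name) {
  return Name.substr(0, Name.find('.'));
}

// Register every function under its full name and under its stripped name,
// so a profile that only knows "foo" still finds "foo.llvm.1234".
std::map<std::string, const FunctionSketch *>
buildSymbolMap(const std::vector<FunctionSketch> &Funcs) {
  std::map<std::string, const FunctionSketch *> Map;
  for (const FunctionSketch &F : Funcs) {
    Map.emplace(F.Name, &F);
    Map.emplace(stripSuffix(F.Name), &F); // no-op if the key already exists
  }
  return Map;
}
```

Because emplace never overwrites, the first function registered under a stripped name wins in this sketch; a real implementation would need a policy for collisions.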
* [SimplifyCFG] Use hasNUses instead of comparing getNumUses to a constant.Craig Topper2017-04-171-1/+1
| | | | | | The use list is a linked list so getNumUses requires a linear scan through the whole list. hasNUses will stop scanning at N and see if that is the end. llvm-svn: 300505
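The cost difference can be seen in a small sketch over a plain singly linked list. UseNode and both functions are simplified stand-ins written for illustration, not LLVM's Use or Value classes.

```cpp
#include <cstddef>

// Hypothetical use-list node; not LLVM's actual Use class.
struct UseNode {
  UseNode *Next = nullptr;
};

// Counts every node: always walks the whole list.
std::size_t getNumUses(const UseNode *Head) {
  std::size_t Count = 0;
  for (const UseNode *U = Head; U; U = U->Next)
    ++Count;
  return Count;
}

// Steps over at most N nodes, then checks that the list ends exactly there.
bool hasNUses(const UseNode *Head, std::size_t N) {
  const UseNode *U = Head;
  for (; N != 0; --N) {
    if (!U)
      return false; // fewer than N uses
    U = U->Next;
  }
  return U == nullptr; // more than N uses if anything is left
}
```

For a value with thousands of uses, `hasNUses(Head, 1)` inspects two nodes while `getNumUses(Head) == 1` visits all of them.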
* [APInt] Merge the multiword code from lshrInPlace and tcShiftRight into a ↵Craig Topper2017-04-171-68/+25
| | | | | | | | | | | | single implementation This merges the two different multiword shift right implementations into a single version located in tcShiftRight. lshrInPlace now calls tcShiftRight for the multiword case. I retained the memmove fast path from lshrInPlace and used a memset for the zeroing. The for loop is basically tcShiftRight's implementation with the zeroing and the intra-shift of 0 removed. Differential Revision: https://reviews.llvm.org/D32114 llvm-svn: 300503
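A sketch of a multiword logical right shift with the same shape as described above: a memmove fast path for whole-word shifts, a loop for the general case, and a memset for the vacated high words. The function name and the 64-bit little-endian word layout are assumptions for illustration; this is not the LLVM source.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>

// Shifts the Words-long little-endian array Dst right by Count bits, in place.
void shiftRightWords(std::uint64_t *Dst, unsigned Words, unsigned Count) {
  if (Count == 0)
    return;
  // Clamp so that shifting by more than the total width zeroes everything.
  unsigned WordShift = std::min(Count / 64, Words);
  unsigned BitShift = Count % 64;
  unsigned WordsToMove = Words - WordShift;

  if (BitShift == 0) {
    // Fast path: whole words only, so a block move is enough.
    std::memmove(Dst, Dst + WordShift, WordsToMove * sizeof(std::uint64_t));
  } else {
    for (unsigned I = 0; I != WordsToMove; ++I) {
      Dst[I] = Dst[I + WordShift] >> BitShift;
      // Pull in the low bits of the next higher word, if there is one.
      if (I + 1 != WordsToMove)
        Dst[I] |= Dst[I + WordShift + 1] << (64 - BitShift);
    }
  }
  // Zero the words vacated at the top.
  std::memset(Dst + WordsToMove, 0, WordShift * sizeof(std::uint64_t));
}
```

The loop only reads words at indices above the one it writes, which is what makes the in-place update safe.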
* [WebAssembly] Fix WebAssemblyOptimizeReturned after r300367Jacob Gravelle2017-04-171-1/+1
| | | | | | | | | | | | | | | | Summary: Refactoring changed paramHasAttr(1 + i) to paramHasAttr(0), fix that to paramHasAttr(i). Add more tests to WebAssemblyOptimizeReturned that catch that regression. Reviewers: dschuff Subscribers: jfb, sbc100, llvm-commits Differential Revision: https://reviews.llvm.org/D32136 llvm-svn: 300502
* [SCEV] Fix another unused variable warning in release builds.Benjamin Kramer2017-04-171-0/+1
| | | | llvm-svn: 300500
* Fix an unused variable error in rL300494.Wei Mi2017-04-171-0/+1
| | | | llvm-svn: 300499
* [libFuzzer] experimental option -cleanse_crash: tries to replace all bytes ↵Kostya Serebryany2017-04-175-0/+85
| | | | | | in a crash reproducer with garbage, while still preserving the crash llvm-svn: 300498
* [InstCombine] Matchers work with both ConstExpr and Instructions.Davide Italiano2017-04-171-2/+2
| | | | | | | | | | So, `cast<Instruction>` is not guaranteed to succeed. Change the code so that we create a new constant and use it in the newly created instruction, as it's done in other places in InstCombine. OK'ed by Sanjay/Craig. Fixes PR32686. llvm-svn: 300495
* [SCEV] Add a local cache for getZeroExtendExpr and getSignExtendExpr to prevent the exponential behavior.Wei Mi2017-04-171-61/+115
| The patch is to fix PR32043. Functions getZeroExtendExpr and getSignExtendExpr may call themselves recursively more than once. This is potentially 2^N complexity. The exponential behavior was not commonly exposed before because of existing global cache mechanisms like UniqueSCEVs, or some early-return mechanisms when flags FlagNSW or FlagNUW are seen. However, we still have cases which can expose the exponential behavior, like the case in PR32043, so we add a local cache in getZeroExtendExpr and getSignExtendExpr. If the input of the functions, a SCEV and type pair, has been seen before, we can find the extended expression directly in the local cache.
|
| Differential Revision: https://reviews.llvm.org/D30350
| llvm-svn: 300494
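The effect of such a local cache can be sketched on a toy expression DAG. ExprSketch, extendSketch, and the use of a bit width in place of a type are all hypothetical; the sketch only illustrates memoizing on an (expression, type) pair, not SCEV's actual logic.

```cpp
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical expression node: a binary DAG whose operands may be shared.
struct ExprSketch {
  const ExprSketch *LHS = nullptr;
  const ExprSketch *RHS = nullptr;
  std::int64_t Leaf = 0;
};

// Cache key: (expression, bit width), standing in for (SCEV, type).
using CacheKey = std::pair<const ExprSketch *, unsigned>;
using Cache = std::map<CacheKey, std::int64_t>;

// Toy "extend": recurses into both operands. Without the cache, a chain of
// N nodes with shared operands costs about 2^N calls; with it, each
// (node, width) pair is computed once. Visits counts the computed pairs.
std::int64_t extendSketch(const ExprSketch *E, unsigned Bits, Cache &Seen,
                          unsigned &Visits) {
  auto It = Seen.find(CacheKey(E, Bits));
  if (It != Seen.end())
    return It->second;
  ++Visits;
  std::int64_t Result =
      E->LHS ? extendSketch(E->LHS, Bits, Seen, Visits) +
                   extendSketch(E->RHS, Bits, Seen, Visits)
             : E->Leaf;
  Seen.emplace(CacheKey(E, Bits), Result);
  return Result;
}
```

On a 31-node chain where each node uses the previous one as both operands, the cached version computes 31 pairs instead of making over a billion recursive calls.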
* [WebAssembly] Encode block signatures as SLEB instead of ULEBDerek Schuff2017-04-171-0/+2
| | | | | | | | Use SLEB (varint) for block_type immediates in accordance with the spec. Patch by Yury Delendik llvm-svn: 300490
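To show why the choice of encoding matters, here is a sketch of both LEB128 variants. The function names are local to this example and are not LLVM's helpers; the remark about block types assumes, as the WebAssembly binary format of the time describes, that they are small negative numbers such as -0x40 for an empty block.

```cpp
#include <cstdint>
#include <vector>

// Unsigned LEB128: 7 payload bits per byte, high bit set on all but the last.
std::vector<std::uint8_t> encodeULEB(std::uint64_t Value) {
  std::vector<std::uint8_t> Out;
  do {
    std::uint8_t Byte = static_cast<std::uint8_t>(Value & 0x7f);
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80;
    Out.push_back(Byte);
  } while (Value != 0);
  return Out;
}

// Signed LEB128: stop once the remaining value is pure sign extension of
// bit 6 of the byte just emitted.
std::vector<std::uint8_t> encodeSLEB(std::int64_t Value) {
  std::vector<std::uint8_t> Out;
  bool More;
  do {
    std::uint8_t Byte = static_cast<std::uint8_t>(Value & 0x7f);
    // Arithmetic shift assumed: guaranteed since C++20 and the behavior of
    // mainstream compilers before that.
    Value >>= 7;
    bool SignBitSet = (Byte & 0x40) != 0;
    More = !((Value == 0 && !SignBitSet) || (Value == -1 && SignBitSet));
    if (More)
      Byte |= 0x80;
    Out.push_back(Byte);
  } while (More);
  return Out;
}
```

Under that assumption, -64 fits in the single byte 0x40 as SLEB128, whereas the positive value 64 needs two SLEB128 bytes (0xC0 0x00) even though it is one byte as ULEB128, so a decoder that reads the field as signed only agrees with the encoder if both use the same variant.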
* Add GNU_discriminator support for inline callsites in llvm-symbolizer.Dehao Chen2017-04-172-3/+7
| Summary: llvm-symbolizer cannot recognize GNU_discriminator for inline callsites. This patch adds support for it.
|
| Reviewers: dblaikie
| Reviewed By: dblaikie
| Subscribers: llvm-commits
| Differential Revision: https://reviews.llvm.org/D32134
| llvm-svn: 300486
* AMDGPU: Use MachineRegisterInfo to find max used registerMatt Arsenault2017-04-172-128/+77
| | | | | | | | | | Avoid looping through program to determine register counts. This avoids needing to look at regmask operands. Also fixes some counting errors with flat_scr when there are no stack objects. llvm-svn: 300482