path: root/llvm/lib
Commit message (Collapse) | Author | Age | Files | Lines
* [MemCpyOpt] Optimize double-storing by memset+memcpy. | Ahmed Bougacha | 2015-04-17 | 1 | -3/+59
  A common idiom in some code is to do the following:

      memset(dst, 0, dst_size);
      memcpy(dst, src, src_size);

  Some of the memset is redundant; instead, we can do:

      memcpy(dst, src, src_size);
      memset(dst + src_size, 0,
             dst_size <= src_size ? 0 : dst_size - src_size);

  Original patch by: Joel Jones
  Differential Revision: http://reviews.llvm.org/D498
  llvm-svn: 235232
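The equivalence claimed in the commit above can be checked on a byte-buffer model. This is only a minimal sketch, not the MemCpyOpt implementation: `original` and `optimized` are hypothetical helper names, `bytearray` slice assignment stands in for `memset`/`memcpy`, and both sizes are assumed to fit inside the buffer.

```python
def original(buf_len, src, dst_size):
    """Model of: memset(dst, 0, dst_size); memcpy(dst, src, src_size)."""
    dst = bytearray(b"\xff" * buf_len)   # 0xff marks bytes neither call touches
    dst[:dst_size] = bytes(dst_size)     # memset(dst, 0, dst_size)
    dst[:len(src)] = src                 # memcpy(dst, src, src_size)
    return dst


def optimized(buf_len, src, dst_size):
    """Model of: memcpy first, then memset only the tail past the copy."""
    src_size = len(src)
    dst = bytearray(b"\xff" * buf_len)
    dst[:src_size] = src                 # memcpy(dst, src, src_size)
    tail = 0 if dst_size <= src_size else dst_size - src_size
    dst[src_size:src_size + tail] = bytes(tail)   # memset(dst + src_size, 0, tail)
    return dst


# Compare both forms for every dst_size that fits in an 8-byte buffer.
for n in range(9):
    assert original(8, b"abc", n) == optimized(8, b"abc", n)
```

The loop also covers the case where `dst_size` is smaller than `src_size`, where the trailing memset has length zero.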
* AsmPrinter: Create a unified .debug_loc stream | Duncan P. N. Exon Smith | 2015-04-17 | 8 | -86/+182
  This commit removes `DebugLocList` and replaces it with `DebugLocStream`.

    - `DebugLocEntry` no longer contains its byte/comment streams.
    - The `DebugLocEntry` list for a variable/inlined-at pair is allocated
      on the stack, and released right after `DebugLocEntry::finalize()`
      (possible because of the refactoring in r231023). Now, only one
      list is in memory at a time.
    - There's a single unified stream for the `.debug_loc` section that
      persists, stored in the new `DebugLocStream` data structure.

  The last point is important: this collapses the nested `SmallVector<>`s from `DebugLocList` into unified streams. We previously had something like the following:

      vec<tuple<Label, CU,
                vec<tuple<BeginSym, EndSym,
                          vec<Value>,
                          vec<char>,
                          vec<string>>>>>

  A `SmallVector` can avoid allocations, but is statically fairly large for a vector: three pointers plus the size of the small storage (which is the number of elements in small mode times the element size). Nesting these is expensive, since an inner vector's size contributes to the element size of an outer one. (Nesting any vector is expensive...)

  In the old data structure, the outer vector's *element* size was 632B, excluding allocation costs for when the middle and inner vectors exceeded their small sizes. 312B of this was for the "three" pointers in the vector-tree beneath it. If you assume 1M functions with an average of 10 variable/inlined-at pairs each (in an LTO scenario), that's almost 6GB (besides inner allocations), with almost 3GB for the "three" pointers.

  This came up in a heap profile a little while ago of a `clang -flto -g` bootstrap, with `DwarfDebug::collectVariableInfo()` using something like 10-15% of the total memory.

  With this commit, we have:

      tuple<vec<tuple<Label, CU, Offset>>,
            vec<tuple<BeginSym, EndSym, Offset, Offset>>,
            vec<char>,
            vec<string>>

  The offsets are used to create `ArrayRef` slices of adjacent `SmallVector`s. This reduces the number of vectors to four (unrelated to the number of variable/inlined-at pairs), and caps the number of allocations at the same number.

  Besides saving memory and limiting allocations, this is NFC.

  I don't know my way around this code very well yet, but I wonder if we could go further: why stream to a side-table, instead of directly to the output stream?

  llvm-svn: 235229
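The idea of replacing nested vectors with a few flat vectors plus offsets can be illustrated outside LLVM. The sketch below is an assumption-laden toy, not the `DebugLocStream` class: `LocStream` and its methods are hypothetical names, the CU and comment-string columns are omitted, and Python lists stand in for `SmallVector`s. Each list records where its entries start, each entry records where its bytes start, and a slice ends where the next one begins.

```python
class LocStream:
    """Toy flat storage: three growing arrays, however many lists exist."""

    def __init__(self):
        self.lists = []          # (label, index of first entry)
        self.entries = []        # (begin_sym, end_sym, offset of first byte)
        self.data = bytearray()  # all entry payloads, back to back

    def start_list(self, label):
        self.lists.append((label, len(self.entries)))

    def add_entry(self, begin_sym, end_sym, payload):
        self.entries.append((begin_sym, end_sym, len(self.data)))
        self.data += payload

    def entries_of(self, i):
        """Slice of entries owned by list i (ends where list i+1 starts)."""
        start = self.lists[i][1]
        end = self.lists[i + 1][1] if i + 1 < len(self.lists) else len(self.entries)
        return self.entries[start:end]

    def bytes_of(self, j):
        """Slice of bytes owned by entry j (ends where entry j+1 starts)."""
        start = self.entries[j][2]
        end = self.entries[j + 1][2] if j + 1 < len(self.entries) else len(self.data)
        return bytes(self.data[start:end])


s = LocStream()
s.start_list("L0")
s.add_entry("b0", "e0", b"\x01\x02")
s.add_entry("b1", "e1", b"\x03")
s.start_list("L1")
s.add_entry("b2", "e2", b"\x04\x05")
```

The trade-off in this layout is that entries can only be appended to the most recently started list, which matches the commit's note that only one list is in flight at a time.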
* Compute A-B when A or B is weak. | Rafael Espindola | 2015-04-17 | 5 | -26/+20
  Similar to r235222, but for the weak symbol case.

  In an "ideal" assembler/object format an expression would always refer to the final value and A-B would only be computed from a section in the same comdat as A and B with A and B strong.

  Unfortunately that is not the case with debug info on ELF, so we need a heuristic. Since we need a heuristic, we may as well use the same one as gas:

    * call weak_sym : produces a relocation, even if in the same section.
    * A - weak_sym and weak_sym - A: don't produce a relocation if we can
      compute it.

  This fixes pr23272 and changes the fix of pr22815 to match what gas does.

  llvm-svn: 235227
* Remove dead code, NFC | Duncan P. N. Exon Smith | 2015-04-17 | 1 | -8/+0
  llvm-svn: 235225
* [AArch64] Avoid vector->load dependency cycles when creating LD1*post. | Ahmed Bougacha | 2015-04-17 | 1 | -0/+7
  They would break the SelectionDAG.

  Note that the opposite load->vector dependency is already obvious in:

      (LD1*post vec, ..)

  llvm-svn: 235224
* [WinEH] Reusing HandlerType entries leads to small CatchHigh values | David Majnemer | 2015-04-17 | 1 | -1/+0
  CatchHigh may be smaller than TryHigh if we reuse an outlined catch handler for two different invokes with different EH states. We have no evidence which shows that CatchHigh must be greater than TryHigh or TryLow. We can revisit this if we turn out to be wrong.

  llvm-svn: 235223
* Compute A-B if both A and B are in the same comdat section. | Rafael Espindola | 2015-04-17 | 1 | -18/+23
  Part of pr23272.

  A small annoyance with the assembly syntax we implement is that given an expression there is no way to know if what is desired is the value of that expression for the symbols in this file or for the final values of those symbols in a link.

  The first case is useful for use in sections that get discarded or ignored if the section they are describing is discarded.

  For example, consider A-B where A and B are in the same comdat section. We can compute the value of the difference in the section that is present in the current .o and if that section survives to the final DSO the value will still be correct.

  But the section is in a comdat. Another section from another object file might be used instead. We know that that section will define A and B, but we have no idea what the value of A-B might be.

  In practice we have to assume that the intention is to compute the value in the current section since otherwise there is no way to create something like the debug aranges section.

  llvm-svn: 235222
* [opaque pointer types] Use the pointee type loaded from bitcode when constructing a LoadInst | David Blaikie | 2015-04-17 | 2 | -7/+8
  Now (with a few carefully placed suppressions relating to general type serialization, etc) we can round trip a simple load through bitcode and textual IR without calling getElementType on a PointerType.

  llvm-svn: 235221
* Fix build errors introduced by r235215 | Pirama Arumuga Nainar | 2015-04-17 | 3 | -2/+6
  Summary:
    - Handle TypePromoteFloat in switch statements
    - Move an expression into an assert to avoid unused variable in
      non-assert builds.

  Reviewers: srhines, ab
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D9086
  llvm-svn: 235220
* Add support to promote f16 to f32 | Pirama Arumuga Nainar | 2015-04-17 | 6 | -4/+527
  Summary:
  This patch adds legalization support to operate on FP16 as a load/store type and do operations on it as floats.

  Tests for ARM are added to test/CodeGen/ARM/fp16-promote.ll

  Reviewers: srhines, t.p.northover
  Differential Revision: http://reviews.llvm.org/D8755
  llvm-svn: 235215
* [mips][FastISel] Implement FastMaterializeAlloca in Mips fast-isel. | Vasileios Kalintiris | 2015-04-17 | 1 | -0/+20
  Summary:
  Implement the method FastMaterializeAlloca in Mips fast-isel

  Based on a patch by Reed Kotler.

  Test Plan:
  Passes test-suite at O0/O2 for mips32 r1/r2
  fastalloca.ll

  Reviewers: dsanders, rkotler
  Subscribers: rfuhler, llvm-commits
  Differential Revision: http://reviews.llvm.org/D6742
  llvm-svn: 235213
* [WinEH] Allow CatchHigh to be equal to TryHigh | David Majnemer | 2015-04-17 | 1 | -1/+1
  Catch blocks which are empty may be in the same state as their try blocks. It is not meaningful to give the catch block its own state number in this case because it can't do anything exceptional.

  llvm-svn: 235212
* [LTO API] add lto_codegen_set_should_internalize. | Manman Ren | 2015-04-17 | 1 | -1/+2
  When debugging LTO issues with ld64, we use -save-temps to save the merged optimized bitcode file, then invoke ld64 again on the single bitcode file. The saved bitcode file is already internalized, so we can call lto_codegen_set_should_internalize and skip running internalization again.

  rdar://20227235
  llvm-svn: 235211
* [X86, AVX] add an exedepfix entry for vmovq == vmovlps == vmovlpd | Sanjay Patel | 2015-04-17 | 1 | -1/+1
  This is the AVX extension of r235014:
  http://llvm.org/viewvc/llvm-project?view=revision&revision=235014

  Review:
  http://reviews.llvm.org/D8691
  llvm-svn: 235210
* AsmPrinter: Store MDExpression directly instead of MDNode, NFC | Duncan P. N. Exon Smith | 2015-04-17 | 2 | -10/+8
  Clean up `DebugLocEntry::Value::Expression`'s type while I'm messing around in here anyway.

  llvm-svn: 235203
* AsmPrinter: Stop storing MDLocalVariable in DebugLocEntry | Duncan P. N. Exon Smith | 2015-04-17 | 2 | -31/+19
  Stop storing the `MDLocalVariable` in the `DebugLocEntry::Value`s. We generate the list of `DebugLocEntry`s separately for each variable/inlined-at pair, so the variable never actually changes here.

  This is effectively NFC (aside from saving some memory and CPU time).

  llvm-svn: 235202
* AsmPrinter: Calculate type upfront for location lists, NFC | Duncan P. N. Exon Smith | 2015-04-17 | 2 | -15/+15
  We can calculate the variable type up front before calling `DebugLocEntry::finalize()`. In fact, since we only care about the type if it's an `MDBasicType`, don't even bother resolving it using the type identifier map.

  llvm-svn: 235201
* [opaque pointer type] Serialize the type of an llvm::Function as a function type rather than a function pointer type | David Blaikie | 2015-04-17 | 2 | -5/+4
  llvm-svn: 235200
* Add support for v1i128 type. | Kit Barton | 2015-04-17 | 1 | -0/+2
  The v1i128 type is needed for the quadword add/subtract instructions introduced in POWER8. Furthermore, the PowerPC ABI specifies that parameters of type v1i128 are to be passed in a single vector register, while parameters of type i128 are passed in pairs of GPRs. Thus, it is necessary to be able to differentiate between v1i128 and i128 in LLVM.

  http://reviews.llvm.org/D8564
  llvm-svn: 235198
* Add the i128 builtin type to LLVM. | Kit Barton | 2015-04-17 | 4 | -2/+10
  The i128 type is needed as a builtin type in order to support the v1i128 vector type. The PowerPC ABI requires that the i128 and v1i128 types are handled differently when passed as parameters to functions (i128 is passed in pairs of GPRs, v1i128 is passed in a single vector register).

  http://reviews.llvm.org/D8564
  llvm-svn: 235196
* [mips][FastISel] Implement shift ops for Mips fast-isel. | Vasileios Kalintiris | 2015-04-17 | 1 | -0/+80
  Summary:
  Add shift operators implementation to fast-isel for Mips. These are shift ops for non legal forms, i.e. i8 and i16.

  Based on a patch by Reed Kotler.

  Test Plan:

  Reviewers: dsanders
  Subscribers: echristo, rfuhler, llvm-commits
  Differential Revision: http://reviews.llvm.org/D6726
  llvm-svn: 235194
* Fix TRUNCATE splitting helper logic. | James Molloy | 2015-04-17 | 2 | -11/+15
  This is a follow-on to r233681 - I'd misunderstood the semantics of FTRUNC, and had confused it with (FP_ROUND ..., 0).

  Thanks to Ahmed Bougacha for his post-commit review!

  llvm-svn: 235191
* Move AliasedSymbol to MachObjectWriter. | Rafael Espindola | 2015-04-17 | 3 | -16/+16
  It was only used by MachO.

  Part of pr19627.

  llvm-svn: 235185
* Revert r235177 as the Handle is used to fail GetExitCodeProcess on purpose. | Yaron Keren | 2015-04-17 | 1 | -1/+3
  Avoid double closing of the handle by testing GetLastError for ERROR_INVALID_HANDLE and not calling CloseHandle(PI.ProcessHandle) in that case.

  llvm-svn: 235184
* [mips] Teach the delay slot filler to remove needless KILL instructions. | Vasileios Kalintiris | 2015-04-17 | 1 | -11/+30
  Summary:
  Previously, the presence of KILL instructions would block valid candidates from filling a specific delay slot. With the elimination of the KILL instructions, in the appropriate range, we are able to fill more slots and keep the information from future def/use analysis consistent.

  Reviewers: dsanders
  Reviewed By: dsanders
  Subscribers: hfinkel, llvm-commits
  Differential Revision: http://reviews.llvm.org/D7724
  llvm-svn: 235183
* Add a proper fix for pr23025. | Rafael Espindola | 2015-04-17 | 1 | -4/+16
  Instead of avoiding looking past every global symbol, only do so if the symbol is in a comdat.

  llvm-svn: 235181
* [mc] Clean up emission of byte sequences | Benjamin Kramer | 2015-04-17 | 7 | -26/+7
  No functional change intended.

  llvm-svn: 235178
* Eliminate superfluous CloseHandle(PI.ProcessHandle). | Yaron Keren | 2015-04-17 | 1 | -1/+0
  This handle will always be closed a few lines later, resulting in an error for the second CloseHandle.

  llvm-svn: 235177
* [mips] Move ABI-dependent register selections to MipsABIInfo. NFC. | Daniel Sanders | 2015-04-17 | 6 | -49/+84
  Summary:
  For example, a common idiom was 'isN64 ? Mips::SP_64 : Mips::SP'. This has been moved to MipsABIInfo and replaced with 'ABI.GetStackPtr()'.

  There are others that should also be moved. This patch sticks to the ones that are obviously non-functional. The others have minor mistakes that need fixing at the same time, mostly involving checks for 64-bit GPRs instead of checks for 64-bit pointers.

  Reviewers: tomatabacu
  Reviewed By: tomatabacu
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D8972
  llvm-svn: 235173
* Revert r235154-r235156, they cause asserts when building win64 code (http://crbug.com/477988) | Nico Weber | 2015-04-17 | 5 | -118/+119
  llvm-svn: 235170
* Don't walk aliases from global to local symbols in comdats. | Rafael Espindola | 2015-04-17 | 1 | -1/+30
  This fixes pr23196.

  llvm-svn: 235167
* Write relocation sections contiguously. | Rafael Espindola | 2015-04-17 | 1 | -1/+8
  Linkers normally read all the relocations upfront to compute the references between sections. Putting them together is a bit more cache friendly.

  I benchmarked linking a Release+Asserts clang with gold on a vm. I tried all 4 combinations of --gc-sections/no --gc-sections and hot/cold cache.

  I cleared the cache with

      echo 3 > /proc/sys/vm/drop_caches

  and warmed it up by running the link once before timing the subsequent ones.

  With cold cache and --gc-sections the time goes from
      1.86130781665 +- 0.01713126697463843 seconds
  to
      1.82370735105 +- 0.014127522318814516 seconds

  With cold cache and no --gc-sections the time goes from
      1.6087245435500002 +- 0.012999066825178644 seconds
  to
      1.5687122041500001 +- 0.013145850126026619 seconds

  With hot cache and no --gc-sections the time goes from
      0.926200939 ( +- 0.33% ) seconds
  to
      0.907200079 ( +- 0.31% ) seconds

  With hot cache and --gc-sections the time goes from
      1.183038049 ( +- 0.34% ) seconds
  to
      1.147355862 ( +- 0.39% ) seconds

  llvm-svn: 235165
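To put the four timing pairs above on one scale, the relative reduction in link time can be computed directly from the quoted means. This is only a worked-arithmetic sketch: the numbers are copied from the commit message, the error bars are ignored, and `improvement_percent` is a hypothetical helper name.

```python
# (before, after) mean link times in seconds, as quoted in the commit message.
timings = {
    "cold cache, --gc-sections": (1.86130781665, 1.82370735105),
    "cold cache, no --gc-sections": (1.6087245435500002, 1.5687122041500001),
    "hot cache, no --gc-sections": (0.926200939, 0.907200079),
    "hot cache, --gc-sections": (1.183038049, 1.147355862),
}


def improvement_percent(before, after):
    """Relative reduction in wall time, as a percentage of the old time."""
    return 100.0 * (before - after) / before


for name, (before, after) in timings.items():
    print(f"{name}: {improvement_percent(before, after):.2f}% less time")
```

All four configurations come out at roughly a 2 to 3 percent reduction, which is small but larger than the reported run-to-run variation.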
* [opaque pointer type] Explicit pointee type for call instruction | David Blaikie | 2015-04-17 | 2 | -12/+25
  Use an extra bit in the CCInfo to flag the newer version of the instruction that includes the type explicitly.

  Tested the newer error cases I added, but didn't add tests for the finer granularity improvements to existing error paths.

  llvm-svn: 235160
* Fix unused variable warning | Reid Kleckner | 2015-04-17 | 1 | -5/+0
  llvm-svn: 235155
* [SEH] Reimplement x64 SEH using WinEHPrepare | Reid Kleckner | 2015-04-17 | 5 | -114/+118
  This now emits simple, unoptimized xdata tables for __C_specific_handler based on the handlers listed in @llvm.eh.actions calls produced by WinEHPrepare.

  This adds support for running __finally blocks when exceptions are thrown, and removes the old landingpad fan-in codepath.

  I ran some manual execution tests on small basic test cases with and without optimization, as well as on Chrome base_unittests, which uses a small amount of SEH. I'm sure there are bugs, and we may need to revert.

  llvm-svn: 235154
* [NaryReassociate] run NaryReassociate iteratively | Jingyue Wu | 2015-04-17 | 1 | -7/+47
  Summary:
  An alternative is to use a worklist approach. However, that approach would break the traversing order so that we couldn't lookup SeenExprs efficiently. I don't see a clear winner here, so I picked the easier approach.

  Along with two minor improvements:
    1. preserves ScalarEvolution by forgetting instructions replaced
    2. removes dead code locally avoiding the need of running DCE afterwards

  Test Plan: add to slsr-add.ll a test that requires multiple iterations

  Reviewers: broune, dberlin, atrick, meheff
  Reviewed By: atrick
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D9058
  llvm-svn: 235151
* [AArch64] Don't assert on f16 in DUP PerfectShuffle generator. | Ahmed Bougacha | 2015-04-16 | 1 | -1/+1
  Found by code inspection, but breaking i16 at least breaks other tests. They aren't checking this in particular though, so also add some explicit tests for the already working types.

  llvm-svn: 235148
* [opaque pointer type] Add textual IR support for explicit type parameter to the call instruction | David Blaikie | 2015-04-16 | 2 | -17/+8
  See r230786 and r230794 for similar changes to gep and load respectively.

  Call is a bit different because it often doesn't have a single explicit type - usually the type is deduced from the arguments, and just the return type is explicit. In those cases there's no need to change the IR.

  When that's not the case, the IR usually contains the pointer type of the first operand - but since typed pointers are going away, that representation is insufficient so I'm just stripping the "pointerness" of the explicit type away.

  This does make the IR a bit weird - it /sort of/ reads like the type of the first operand: "call void () %x(" but %x is actually of type "void ()*" and will eventually be just of type "ptr". But this seems not too bad and I don't think it would benefit from repeating the type ("void (), void () * %x(" and then eventually "void (), ptr %x(") as has been done with gep and load.

  This also has a side benefit: since the explicit type is no longer a pointer, there's no ambiguity between an explicit type and a function that returns a function pointer. Previously this case needed an explicit type (e.g. a function returning a void() function was written as "call void () () * @x(" rather than "call void () * @x(" because of the ambiguity between a function returning a pointer to a void() function and a function returning void).

  No ambiguity means even function pointer return types can just be written alone, without writing the whole function's type.

  This leaves /only/ the varargs case where the explicit type is required.

  Given the special type syntax in call instructions, the regex-fu used for migration was a bit more involved in its own unique way (as every one of these is) so here it is. Use it in conjunction with the apply.sh script and associated find/xargs commands I've provided in r230786 to migrate your out of tree tests. Do let me know if any of this doesn't cover your cases & we can iterate on a more general script/regexes to help others with out of tree tests.

  About 9 test cases couldn't be automatically migrated - half of those were functions returning function pointers, where I just had to manually delete the function argument types now that we didn't need an explicit function type there. The other half were typedefs of function types used in calls - just had to manually drop the * from those.

      import fileinput
      import sys
      import re

      # group(1): everything up to and including the explicit type after "call";
      # group(2): the callee (and the rest of the line).
      pat = re.compile(r'((?:=|:|^|\s)call\s(?:[^@]*?))(\s*$|\s*(?:(?:\[\[[a-zA-Z0-9_]+\]\]|[@%](?:(")?[\\\?@a-zA-Z0-9_.]*?(?(3)"|)|{{.*}}))(?:\(|$)|undef|inttoptr|bitcast|null|asm).*$)')
      addrspace_end = re.compile(r"addrspace\(\d+\)\s*\*$")
      func_end = re.compile(r"(?:void.*|\)\s*)\*$")

      def conv(match, line):
        # Leave the line alone unless the explicit type ends in a function pointer '*'.
        if not match or re.search(addrspace_end, match.group(1)) or not re.search(func_end, match.group(1)):
          return line
        # Drop the trailing '*' (and any whitespace before it) from the explicit type.
        return line[:match.start()] + match.group(1)[:match.group(1).rfind('*')].rstrip() + match.group(2) + line[match.end():]

      for line in sys.stdin:
        sys.stdout.write(conv(re.search(pat, line), line))

  llvm-svn: 235145
* DebugInfo: Fix UserValue::match() in LiveDebugVariables after r235050 | Duncan P. N. Exon Smith | 2015-04-16 | 1 | -5/+5
  r235050 dropped the inlined-at field from `MDLocalVariable`, deferring to the `!dbg` attachments. Fix `UserValue` to take the `!dbg` into account when differentiating between variables.

  llvm-svn: 235140
* AsmPrinter: Remove dead code, NFC | Duncan P. N. Exon Smith | 2015-04-16 | 1 | -1/+0
  llvm-svn: 235139
* AsmPrinter: Simplify logic for debug info intrinsics' !dbg attachments | Duncan P. N. Exon Smith | 2015-04-16 | 3 | -11/+6
  These are required, so just assume they're there.

  llvm-svn: 235138
* Disable AArch64 fast-isel on big-endian call vector returns. | Pete Cooper | 2015-04-16 | 1 | -0/+5
  A big-endian vector return needs a byte-swap which we aren't doing right now.

  For now just bail on these cases to get correctness back.

  llvm-svn: 235133
* [IR] Introduce a dereferenceable_or_null(N) attribute. | Sanjoy Das | 2015-04-16 | 10 | -30/+121
  Summary:
  If a pointer is marked as dereferenceable_or_null(N), LLVM assumes it is either `null` or `dereferenceable(N)` or both. This change only introduces the attribute and adds a token test case for `llvm-as` / `llvm-dis`. It does not hook up other parts of the optimizer to actually exploit the attribute -- those changes will come later.

  For pointers in address space 0, `dereferenceable(N)` is now exactly equivalent to `dereferenceable_or_null(N)` && `nonnull`. For other address spaces, `dereferenceable(N)` is potentially weaker than `dereferenceable_or_null(N)` && `nonnull` (since we could have a null `dereferenceable(N)` pointer).

  The motivating case for this change is Java (and other managed languages), where pointers are either `null` or dereferenceable up to some usually known-at-compile-time constant offset.

  Reviewers: rafael, hfinkel
  Reviewed By: hfinkel
  Subscribers: nicholas, llvm-commits
  Differential Revision: http://reviews.llvm.org/D8650
  llvm-svn: 235132
* [NaryReassociate] speeds up candidate searching | Jingyue Wu | 2015-04-16 | 1 | -9/+15
  Summary:
  This fixes a left-over efficiency issue in D8950. As Andrew and Daniel suggested, we can store the candidates in a stack and pop the top element when it does not dominate the current instruction. This reduces the worst-case time complexity to O(n).

  Test Plan: a new test in nary-add.ll that exercises this optimization.

  Reviewers: broune, dberlin, meheff, atrick
  Reviewed By: atrick
  Subscribers: llvm-commits, sanjoy
  Differential Revision: http://reviews.llvm.org/D9055
  llvm-svn: 235129
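The stack-and-pop idea from the commit above can be sketched on a toy dominator tree. This is an illustration under stated assumptions, not the NaryReassociate code: each instruction is treated as its own dominator-tree node, instructions are visited in dominator-tree pre-order, dominance is tested with DFS entry/exit numbers, and keeping one stack per expression key is an assumption of this sketch. All names (`closest_dominating_candidates`, `dfs_in`, `dfs_out`, `key_of`) are hypothetical.

```python
def closest_dominating_candidates(preorder, dfs_in, dfs_out, key_of):
    """Map each instruction to the nearest earlier instruction with the same
    expression key that dominates it, or None if there is none."""

    def dominates(a, b):
        # a dominates b iff b's DFS interval is nested inside a's.
        return dfs_in[a] <= dfs_in[b] and dfs_out[b] <= dfs_out[a]

    stacks = {}   # expression key -> stack of candidate instructions
    result = {}
    for inst in preorder:
        stack = stacks.setdefault(key_of[inst], [])
        # In pre-order, a candidate that fails to dominate the current
        # instruction cannot dominate any later one, so it is safe to drop.
        while stack and not dominates(stack[-1], inst):
            stack.pop()
        result[inst] = stack[-1] if stack else None
        stack.append(inst)
    return result


# Toy dominator tree: A is the root, B and C are children of A, D is a child of B.
dfs_in = {"A": 0, "B": 1, "D": 2, "C": 5}
dfs_out = {"A": 7, "B": 4, "D": 3, "C": 6}
key_of = {name: "x+y" for name in dfs_in}   # all compute the same expression
found = closest_dominating_candidates(["A", "B", "D", "C"], dfs_in, dfs_out, key_of)
```

Each instruction is pushed once and popped at most once, which is where the linear bound quoted in the commit message comes from.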
* [X86, SSE] instcombine common cases of insertps intrinsics into shuffles | Sanjay Patel | 2015-04-16 | 1 | -2/+45
  This is very similar to D8486 / r232852 (vperm2). If we treat insertps intrinsics as shufflevectors, we can optimize them better.

  I've left all but the full zero case of the zero mask variants out of this patch. I don't think those can be converted into a single shuffle in all cases, but I'd be happy to be proven wrong as I was for vperm2f128. Either way, we'd need to support whatever sequence we come up with for those cases in the backend before converting them here.

  Differential Revision: http://reviews.llvm.org/D8833
  llvm-svn: 235124
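Why an insertps with an empty zero mask is just a two-input shuffle can be seen in a small model. This is a sketch, not the instcombine code: it models the register form of INSERTPS (immediate bits 7:6 select the source element, bits 5:4 the destination lane, bits 3:0 the zero mask), and the helper names and the operand order of `shuffle` are conventions of this example only.

```python
def insertps(dst, src, imm8):
    """Reference model of the 4 x float INSERTPS register form."""
    count_s = (imm8 >> 6) & 3   # which src element to read
    count_d = (imm8 >> 4) & 3   # which dst lane to overwrite
    zmask = imm8 & 0xF          # lanes forced to zero afterwards
    out = list(dst)
    out[count_d] = src[count_s]
    return [0.0 if (zmask >> i) & 1 else out[i] for i in range(4)]


def shuffle_mask(imm8):
    """Equivalent shuffle mask when no lane is zeroed, else None.
    Indices 0-3 pick from dst, indices 4-7 pick from src."""
    if imm8 & 0xF:
        return None   # zero-mask variants are not modeled here
    mask = [0, 1, 2, 3]
    mask[(imm8 >> 4) & 3] = 4 + ((imm8 >> 6) & 3)
    return mask


def shuffle(a, b, mask):
    """Pick lanes from the concatenation of a and b."""
    both = list(a) + list(b)
    return [both[i] for i in mask]


dst = [1.0, 2.0, 3.0, 4.0]
src = [5.0, 6.0, 7.0, 8.0]
# Every immediate with an empty zero mask agrees with its shuffle form.
for imm8 in range(0, 256, 16):
    assert insertps(dst, src, imm8) == shuffle(dst, src, shuffle_mask(imm8))
```

The partial zero-mask immediates would need a third, all-zero input or a second operation, which is consistent with the commit leaving them out.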
* [WinEH] Handle a landingpad, resume, and cleanup all rolled into a BB | Reid Kleckner | 2015-04-16 | 1 | -6/+4
  This happens a lot with simple cleanups after SimplifyCFG.

  llvm-svn: 235117
* DebugInfo: Allow DebugLocs to be constructed from const | Duncan P. N. Exon Smith | 2015-04-16 | 1 | -5/+7
  Allow `const`-qualified pointers to be used to construct `DebugLoc`s, as a convenience.

  llvm-svn: 235115
* DebugInfo: Remove DIDescriptor from the DIBuilder API | Duncan P. N. Exon Smith | 2015-04-16 | 1 | -241/+209
  As a step toward killing `DIDescriptor` and its subclasses, remove it from the `DIBuilder` API. Replace the subclasses with appropriate pointers from the new debug info hierarchy. There are a couple of possible surprises in type choices for out-of-tree frontends:

    - Subroutine types: `MDSubroutineType`, not `MDCompositeTypeBase`.
    - Composite types: `MDCompositeType`, not `MDCompositeTypeBase`.
    - Scopes: `MDScope`, not `MDNode`.
    - Generic debug info nodes: `DebugNode`, not `MDNode`.

  This is part of PR23080.

  llvm-svn: 235111
* Revert the switch lowering change (r235101, r235103, r235106) | Hans Wennborg | 2015-04-16 | 3 | -897/+760
  Looks like it broke the sanitizer-ppc64-linux1 build. Reverting for now.

  llvm-svn: 235108
* [AArch64] Add v8.1a "Virtualization Host Extensions" | Vladimir Sukharev | 2015-04-16 | 2 | -1/+59
  Reviewers: t.p.northover
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D8500

  Patch by: Tom Coxon
  llvm-svn: 235107