path: root/llvm/lib/CodeGen
Commit message | Author | Age | Files | Lines
...
* [CGP] Re-enable Select in complex addressing mode. | Serguei Katkov | 2018-01-26 | 1 | -1/+1
  Switch Select handling on after fixing two bugs: rL323192 and rL323497.
  llvm-svn: 323498
* [CodeGen] Ignore private symbols in llvm.used for COFF | Shoaib Meenai | 2018-01-26 | 1 | -4/+4
  Similar to the existing handling for internal symbols, private symbols are also not visible to the linker and should be ignored.
  llvm-svn: 323483
* [GISel]: Implement GlobalISel combiner API. | Aditya Nandakumar | 2018-01-25 | 3 | -0/+124
  https://reviews.llvm.org/D41373
  The various components are:
  - GICombinerHelper contains transformations that are common to all targets. Targets can pick and choose which transformations (at function/opcode granularity) each pass uses by configuring a GICombinerInfo.
  - GICombiner contains some common code; it does the traversal, drives the combines, manages the worklist, and iterates until convergence.
  - GICombinerInfo is an interface with a virtual method called combine. The combiner info allows targets to pick and choose combines (or implement their own specific combines). CombinerInfos can make use of the combines available in GICombinerHelper to configure the transformations for a particular pass.
  Currently this approach allows cherry-picking transformations from helpers (at function/opcode granularity) and also allows returning early on specific transformations. Targets also get to prioritize whether target-specific combines run before or after the opt-in generic combines. Ideally we would like this part to be configurable from both C++ and Tablegen.
  The CombinerInfo also has a field which indicates how to deal with IllegalOps (i.e. should we allow creating them, or legalize them?).
  A CombinerPass would configure a CombinerInfo, create the GICombiner with the Info, and call GICombiner::combineMachineInstrs(MachineFunction&). This organization is very similar to the GISelLegalizer.
  llvm-svn: 323392
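
The division of labour described in this commit message (an info object that decides which combines to try, and a driver that owns the worklist and iterates until nothing changes) can be sketched with a toy model. Everything below is hypothetical: `ToyInstr`, `ToyCombinerInfo`, `FoldAddZeroInfo` and `runToyCombiner` are illustrative names, not the LLVM classes, and only the overall shape follows the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Toy stand-in for a machine instruction: either "add x, Imm" or "copy x".
struct ToyInstr {
  enum Kind { Add, Copy } Op;
  int Imm; // immediate operand of an Add; unused for Copy
};

// Interface with a single virtual combine(), mirroring the role the commit
// message gives to the combiner info. Returns true if the instruction changed.
struct ToyCombinerInfo {
  virtual ~ToyCombinerInfo() = default;
  virtual bool combine(ToyInstr &MI) const = 0;
};

// A "target" opts in to exactly one transformation: add x, 0 -> copy x.
struct FoldAddZeroInfo : ToyCombinerInfo {
  bool combine(ToyInstr &MI) const override {
    if (MI.Op == ToyInstr::Add && MI.Imm == 0) {
      MI.Op = ToyInstr::Copy;
      return true;
    }
    return false;
  }
};

// Driver: owns the worklist and re-queues changed instructions until no
// combine fires any more. Returns the number of successful combines.
inline int runToyCombiner(std::vector<ToyInstr> &Function,
                          const ToyCombinerInfo &Info) {
  std::deque<std::size_t> Worklist;
  for (std::size_t I = 0; I < Function.size(); ++I)
    Worklist.push_back(I);
  int NumCombines = 0;
  while (!Worklist.empty()) {
    std::size_t Idx = Worklist.front();
    Worklist.pop_front();
    if (Info.combine(Function[Idx])) {
      ++NumCombines;
      Worklist.push_back(Idx); // revisit: a change may expose another combine
    }
  }
  return NumCombines;
}
```

Running `runToyCombiner` over a function holding `add x, 0`, `add x, 4` and `copy x` with a `FoldAddZeroInfo` rewrites only the first instruction and then converges.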
* [GlobalISel] Don't fall back to FastISel. | Amara Emerson | 2018-01-24 | 2 | -1/+5
  Apparently checking the pass structure isn't enough to ensure that we don't fall back to FastISel, as it's set up as part of the SelectionDAGISel.
  llvm-svn: 323369
* [globalisel] Introduce LegalityQuery to better encapsulate the legalizer decisions. NFC. | Daniel Sanders | 2018-01-24 | 2 | -14/+23
  Summary:
  `getAction(const InstrAspect &) const` breaks encapsulation by exposing the smaller components that are used to decide how to legalize an instruction. This is a problem because we need to change the implementation of LegalizerInfo so that it's able to describe particular type combinations rather than just cartesian products of types.
  For example, declaring the following
    setAction({..., 0, s32}, Legal)
    setAction({..., 0, s64}, Legal)
    setAction({..., 1, s32}, Legal)
    setAction({..., 1, s64}, Legal)
  currently declares these type combinations as legal:
    {s32, s32}
    {s64, s32}
    {s32, s64}
    {s64, s64}
  but we currently have no means to say that, for example, {s64, s32} is not legal. Some operations such as G_INSERT/G_EXTRACT/G_MERGE_VALUES/G_UNMERGE_VALUES have relationships between the types that are currently described incorrectly.
  Additionally, G_LOAD/G_STORE currently have no means to legalize non-atomics differently from atomics. The necessary information is in the MMO but we have no way to use this in the legalizer. Similarly, there is currently no way for the register type and the memory type to differ, so there is no way to cleanly represent extending-load/truncating-store in a way that can't be broken by optimizers (resulting in illegal MIR).
  This patch introduces LegalityQuery, which provides all the information needed by the legalizer to make a decision on whether something is legal and how to legalize it.
  Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar, volkan, reames, bogner
  Reviewed By: bogner
  Subscribers: bogner, llvm-commits, kristof.beyls
  Differential Revision: https://reviews.llvm.org/D42244
  llvm-svn: 323342
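
The cartesian-product limitation described in this commit message can be illustrated with a small toy model. This is a sketch under stated assumptions, not the LLVM API: `PerIndexRules`, `ToyLegalityQuery` and `PerCombinationRules` are hypothetical names, and types are reduced to bare bit widths.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Old style (toy): one set of legal sizes per type index. A type list is
// legal when every index is individually legal, so the set of legal
// combinations is always a cartesian product.
struct PerIndexRules {
  std::vector<std::set<unsigned>> LegalSizes; // LegalSizes[TypeIdx]
  bool isLegal(const std::vector<unsigned> &Types) const {
    if (Types.size() != LegalSizes.size())
      return false;
    for (std::size_t I = 0; I < Types.size(); ++I)
      if (!LegalSizes[I].count(Types[I]))
        return false;
    return true;
  }
};

// Query style (toy): the whole type list travels together, so individual
// combinations can be accepted or rejected.
struct ToyLegalityQuery {
  std::vector<unsigned> Types;
};

struct PerCombinationRules {
  std::set<std::vector<unsigned>> LegalCombos;
  bool isLegal(const ToyLegalityQuery &Q) const {
    return LegalCombos.count(Q.Types) != 0;
  }
};
```

With `{32, 64}` legal at both indices, `PerIndexRules` cannot avoid accepting `{64, 32}`, whereas `PerCombinationRules` accepts only the combinations that were listed explicitly.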
* [DebugInfo] Emit DWARF reference for DIVariable 'count' in DISubrange | Sander de Smalen | 2018-01-24 | 2 | -1/+8
  Summary:
  This patch implements the codegen of DWARF debug info for non-constant 'count' fields for DISubrange.
  This is patch [2/3] in a series to extend LLVM's DISubrange Metadata node to support debugging of C99 variable length arrays and vectors with runtime length like the Scalable Vector Extension for AArch64. It is also a first step towards representing more complex cases like arrays in Fortran.
  Reviewers: echristo, pcc, aprantl, dexonsmith, clayborg, kristof.beyls, dblaikie
  Reviewed By: aprantl
  Subscribers: fhahn, aemerson, rengolin, JDevlieghere, llvm-commits
  Differential Revision: https://reviews.llvm.org/D41696
  llvm-svn: 323323
* [Metadata] Extend 'count' field of DISubrange to take a metadata node | Sander de Smalen | 2018-01-24 | 2 | -2/+6
  Summary:
  This patch extends the DISubrange 'count' field to take either a (signed) constant integer value or a reference to a DILocalVariable or DIGlobalVariable.
  This is patch [1/3] in a series to extend LLVM's DISubrange Metadata node to support debugging of C99 variable length arrays and vectors with runtime length like the Scalable Vector Extension for AArch64. It is also a first step towards representing more complex cases like arrays in Fortran.
  Reviewers: echristo, pcc, aprantl, dexonsmith, clayborg, kristof.beyls, dblaikie
  Reviewed By: aprantl
  Subscribers: rnk, probinson, fhahn, aemerson, rengolin, JDevlieghere, llvm-commits
  Differential Revision: https://reviews.llvm.org/D41695
  llvm-svn: 323313
* [DAGCombiner] Bail out if vector size is not a multiple | Sven van Haastregt | 2018-01-24 | 1 | -0/+4
  For the included test case, the DAG transformation
    concat_vectors(scalar, undef) -> scalar_to_vector(sclr)
  would attempt to create a v2i32 vector for a v9i8 concat_vector. Bail out to avoid creating a bitcast with mismatching sizes later on.
  Differential Revision: https://reviews.llvm.org/D42379
  llvm-svn: 323312
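
The size mismatch behind this bail-out is plain arithmetic: a v9i8 value is 72 bits wide, which is not a whole number of i32 elements, so a v2i32 (64 bits) cannot be bitcast to it. A minimal sketch of such a divisibility check follows; the function name and parameters are illustrative and this is not the DAGCombiner code.

```cpp
#include <cassert>

// Toy check: a vector of NumElts elements of EltBits bits each can only be
// re-expressed with ScalarBits-wide elements if the total width divides
// evenly. Otherwise the caller should bail out.
inline bool canUseScalarElements(unsigned NumElts, unsigned EltBits,
                                 unsigned ScalarBits) {
  if (ScalarBits == 0)
    return false;
  unsigned TotalBits = NumElts * EltBits;
  return TotalBits % ScalarBits == 0;
}
```

For the case in the commit message, `canUseScalarElements(9, 8, 32)` is false (72 is not a multiple of 32), while a v8i8 would pass the check.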
* [GlobalMerge] Don't merge dllexport globals | Martin Storsjo | 2018-01-24 | 1 | -1/+2
  Merging such globals loses the dllexport attribute. Add a test to check that normal globals still are merged.
  Differential Revision: https://reviews.llvm.org/D42127
  llvm-svn: 323307
* [GISel]: Remove redundant copies at the end of ISel | Aditya Nandakumar | 2018-01-24 | 1 | -0/+32
  https://reviews.llvm.org/D42402
  A lot of these copies are useless (copies between VRegs having the same regclass) and should be cleaned up.
  llvm-svn: 323291
* [safestack] Inline safestack pointer access when possible. | Evgeniy Stepanov | 2018-01-23 | 1 | -1/+50
  Summary:
  This adds an -mllvm flag that forces the use of a runtime function call to get the unsafe stack pointer, the same that is currently used on non-x86, non-aarch64 android. The call may be inlined.
  Reviewers: pcc
  Subscribers: aemerson, kristof.beyls, hiraditya, llvm-commits
  Differential Revision: https://reviews.llvm.org/D37405
  llvm-svn: 323259
* CodeGen: Fix assertion in ScheduleDAGMILive::scheduleMI due to llvm.dbg.value | Yaxun Liu | 2018-01-23 | 1 | -0/+1
  Fix a bug in ScheduleDAGMILive::scheduleMI which causes BotRPTracker not to track CurrentBottom in some rare cases involving llvm.dbg.value.
  This issue causes the amdgcn target to assert when compiling some user code with -g.
  Differential Revision: https://reviews.llvm.org/D42394
  llvm-svn: 323214
* [CGP] Fix the GV handling in complex addressing mode | Serguei Katkov | 2018-01-23 | 1 | -15/+21
  If, in complex addressing mode, the difference is in the GV, then the base reg should not be installed, because we plan to use the base reg as a merge point of different GVs.
  This is a fix for PR35980.
  Reviewers: reames, john.brawn, santosh
  Reviewed By: john.brawn
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D42230
  llvm-svn: 323192
* Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", which is one of the two halves of Spectre. | Chandler Carruth | 2018-01-22 | 5 | -0/+230
  Summary:
  First, we need to explain the core of the vulnerability. Note that this is a very incomplete description; please see the Project Zero blog post for details: https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html
  The basis for branch target injection is to direct speculative execution of the processor to some "gadget" of executable code by poisoning the prediction of indirect branches with the address of that gadget. The gadget in turn contains an operation that provides a side channel for reading data. Most commonly, this will look like a load of secret data followed by a branch on the loaded value and then a load of some predictable cache line. The attacker then uses timing of the processor's cache to determine which direction the branch took *in the speculative execution*, and in turn what one bit of the loaded value was. Due to the nature of these timing side channels and the branch predictor on Intel processors, this allows an attacker to leak data only accessible to a privileged domain (like the kernel) back into an unprivileged domain.
  The goal is simple: avoid generating code which contains an indirect branch that could have its prediction poisoned by an attacker. In many cases, the compiler can simply use directed conditional branches and a small search tree. LLVM already has support for lowering switches in this way, and the first step of this patch is to disable jump-table lowering of switches and introduce a pass to rewrite explicit indirectbr sequences into a switch over integers.
  However, there is no fully general alternative to indirect calls. We introduce a new construct we call a "retpoline" to implement indirect calls in a non-speculatable way. It can be thought of loosely as a trampoline for indirect calls which uses the RET instruction on x86. Further, we arrange for a specific call->ret sequence which ensures the processor predicts the return to go to a controlled, known location. The retpoline then "smashes" the return address pushed onto the stack by the call with the desired target of the original indirect call. The result is a predicted return to the next instruction after a call (which can be used to trap speculative execution within an infinite loop) and an actual indirect branch to an arbitrary address.
  On 64-bit x86 ABIs, this is especially easily done in the compiler by using a guaranteed scratch register to pass the target into this device. For 32-bit ABIs there isn't a guaranteed scratch register, and so several different retpoline variants are introduced to use a scratch register if one is available in the calling convention and to otherwise use direct stack push/pop sequences to pass the target address.
  This "retpoline" mitigation is fully described in the following blog post: https://support.google.com/faqs/answer/7625886
  We also support a target feature that disables emission of the retpoline thunk by the compiler to allow for custom thunks if users want them. These are particularly useful in environments like kernels that routinely do hot-patching on boot and want to hot-patch their thunk to different code sequences. They can write this custom thunk and use `-mretpoline-external-thunk` *in addition* to `-mretpoline`. In this case, on x86-64 the thunk names must be:
  ```
  __llvm_external_retpoline_r11
  ```
  or on 32-bit:
  ```
  __llvm_external_retpoline_eax
  __llvm_external_retpoline_ecx
  __llvm_external_retpoline_edx
  __llvm_external_retpoline_push
  ```
  And the target of the retpoline is passed in the named register, or in the case of the `push` suffix on the top of the stack via a `pushl` instruction.
  There is one other important source of indirect branches in x86 ELF binaries: the PLT. These patches also include support for LLD to generate PLT entries that perform a retpoline-style indirection.
  The only other indirect branches remaining that we are aware of are from precompiled runtimes (such as crt0.o and similar). The ones we have found are not really attackable, and so we have not focused on them here, but eventually these runtimes should also be replicated for retpoline-ed configurations for completeness.
  For kernels or other freestanding or fully static executables, the compiler switch `-mretpoline` is sufficient to fully mitigate this particular attack. For dynamic executables, you must compile *all* libraries with `-mretpoline` and additionally link the dynamic executable and all shared libraries with LLD and pass `-z retpolineplt` (or use similar functionality from some other linker). We strongly recommend also using `-z now`, as non-lazy binding allows the retpoline-mitigated PLT to be substantially smaller.
  When manually applying transformations similar to `-mretpoline` to the Linux kernel, we observed very small performance hits to applications running typical workloads, and relatively minor hits (approximately 2%) even for extremely syscall-heavy applications. This is largely due to the small number of indirect branches that occur in performance-sensitive paths of the kernel.
  When using these patches on statically linked applications, especially C++ applications, you should expect to see a much more dramatic performance hit. For microbenchmarks that are switch-, indirect-, or virtual-call heavy we have seen overheads ranging from 10% to 50%.
  However, real-world workloads exhibit substantially lower performance impact. Notably, techniques such as PGO and ThinLTO dramatically reduce the impact of hot indirect calls (by speculatively promoting them to direct calls) and allow optimized search trees to be used to lower switches. If you need to deploy these techniques in C++ applications, we *strongly* recommend that you ensure all hot call targets are statically linked (avoiding PLT indirection) and use both PGO and ThinLTO. Well-tuned servers using all of these techniques saw 5% - 10% overhead from the use of retpoline.
  We will add detailed documentation covering these components in subsequent patches, but wanted to make the core functionality available as soon as possible. Happy for more code review, but we'd really like to get these patches landed and backported ASAP for obvious reasons. We're planning to backport this to both 6.0 and 5.0 release streams and get a 5.0 release with just this cherry picked ASAP for distros and vendors.
  This patch is the work of a number of people over the past month: Eric, Reid, Rui, and myself. I'm mailing it out as a single commit due to the time-sensitive nature of landing this and the need to backport it. Huge thanks to everyone who helped out here, and everyone at Intel who helped out in discussions about how to craft this. Also, credit goes to Paul Turner (at Google, but not an LLVM contributor) for much of the underlying retpoline design.
  Reviewers: echristo, rnk, ruiu, craig.topper, DavidKreitzer
  Subscribers: sanjoy, emaste, mcrosier, mgorny, mehdi_amini, hiraditya, llvm-commits
  Differential Revision: https://reviews.llvm.org/D41723
  llvm-svn: 323155
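
The external thunk naming scheme listed in this commit message (one symbol per register that may carry the call target, plus a `push` variant for the stack) can be summarised as a lookup. The sketch below is hypothetical: the `TargetLocation` enum and `externalThunkName` helper are illustrative and not part of LLVM; only the symbol names are taken from the commit message.

```cpp
#include <cassert>
#include <string>

// Where the retpoline's branch target is passed: r11 on x86-64, or one of
// eax/ecx/edx on 32-bit x86, or pushed on the stack when no scratch
// register is available.
enum class TargetLocation { R11, EAX, ECX, EDX, Push };

// Hypothetical helper: map the location to the external thunk symbol that
// a user-provided thunk would have to define.
inline std::string externalThunkName(TargetLocation Loc) {
  switch (Loc) {
  case TargetLocation::R11:
    return "__llvm_external_retpoline_r11";
  case TargetLocation::EAX:
    return "__llvm_external_retpoline_eax";
  case TargetLocation::ECX:
    return "__llvm_external_retpoline_ecx";
  case TargetLocation::EDX:
    return "__llvm_external_retpoline_edx";
  case TargetLocation::Push:
    return "__llvm_external_retpoline_push";
  }
  return ""; // unreachable for valid enumerators
}
```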
* Fixing warnings caused by commit 323095 | Marina Yatsina | 2018-01-22 | 3 | -10/+10
  Change-Id: I4e1f81db2f5382a820f4016c23b243e4d5aebf51
  llvm-svn: 323114
* Separate LoopTraversal, ReachingDefAnalysis and BreakFalseDeps into their own files. | Marina Yatsina | 2018-01-22 | 5 | -417/+547
  This is one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
  Most of the patches are intended to refactor the existing code.
  Additional relevant reviews: https://reviews.llvm.org/D40330 https://reviews.llvm.org/D40331 https://reviews.llvm.org/D40332 https://reviews.llvm.org/D40334
  Differential Revision: https://reviews.llvm.org/D40333
  Change-Id: Ie5f8eb34d98cfdfae23a3072eb69b5794f0e2d56
  llvm-svn: 323095
* Rename ExecutionDepsFix files to ExecutionDomainFix | Marina Yatsina | 2018-01-22 | 2 | -3/+3
  This is one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
  Most of the patches are intended to refactor the existing code.
  Additional relevant reviews: https://reviews.llvm.org/D40330 https://reviews.llvm.org/D40331 https://reviews.llvm.org/D40333 https://reviews.llvm.org/D40334
  Differential Revision: https://reviews.llvm.org/D40332
  Change-Id: I6a048cca7fdafbfc42fb1bac94343e483befded8
  llvm-svn: 323094
* ExecutionDepsFix refactoring: | Marina Yatsina | 2018-01-22 | 1 | -53/+67
  - clang-format
  This is one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
  Most of the patches are intended to refactor the existing code.
  Additional relevant reviews: https://reviews.llvm.org/D40330 https://reviews.llvm.org/D40332 https://reviews.llvm.org/D40333 https://reviews.llvm.org/D40334
  Differential Revision: https://reviews.llvm.org/D40331
  Change-Id: I131b126af13bc743bc5d69d83699e52b9b720979
  llvm-svn: 323093
* ExecutionDepsFix refactoring: | Marina Yatsina | 2018-01-22 | 1 | -66/+0
  - Moving comments to class definition in header file
  - Changing comments to doxygen style
  - Rephrasing the comment that explains loop traversal
  This is one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
  Most of the patches are intended to refactor the existing code.
  Additional relevant reviews: https://reviews.llvm.org/D40330 https://reviews.llvm.org/D40332 https://reviews.llvm.org/D40333 https://reviews.llvm.org/D40334
  Differential Revision: https://reviews.llvm.org/D40331
  Change-Id: I9a12618db5b66128611fa71b54a233414f6012ac
  llvm-svn: 323092
* ExecutionDepsFix refactoring: | Marina Yatsina | 2018-01-22 | 1 | -56/+49
  - Removing LiveRegs
  This is one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
  Most of the patches are intended to refactor the existing code.
  Additional relevant reviews: https://reviews.llvm.org/D40330 https://reviews.llvm.org/D40332 https://reviews.llvm.org/D40333 https://reviews.llvm.org/D40334
  Differential Revision: https://reviews.llvm.org/D40331
  Change-Id: I8ab56d99951a6d6981542f68d94c1f624f3c9fbf
  llvm-svn: 323091
* ExecutionDepsFix refactoring: | Marina Yatsina | 2018-01-22 | 1 | -50/+40
  - Changing LiveRegs to be a vector
  This is one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
  Most of the patches are intended to refactor the existing code.
  Additional relevant reviews: https://reviews.llvm.org/D40330 https://reviews.llvm.org/D40332 https://reviews.llvm.org/D40333 https://reviews.llvm.org/D40334
  Differential Revision: https://reviews.llvm.org/D40331
  Change-Id: I9cdd364bd7bf2a0bf61ea41a48d4bd310ec3bce4
  llvm-svn: 323090
* ExecutionDepsFix refactoring: | Marina Yatsina | 2018-01-22 | 1 | -32/+38
  - Changing DenseMap<MBB*, LiveReg*> to SmallVector<LiveReg*>
  - Now the MBB number will be the index of LiveReg in the vector.
  - Adding asserts
  This patch is NFC.
  This is one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
  Most of the patches are intended to refactor the existing code.
  Additional relevant reviews: https://reviews.llvm.org/D40330 https://reviews.llvm.org/D40332 https://reviews.llvm.org/D40333 https://reviews.llvm.org/D40334
  Differential Revision: https://reviews.llvm.org/D40331
  Change-Id: If4a3f141693d0361ddb292432337dbb63a1e69ee
  llvm-svn: 323089
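
The data-structure change in this commit (replacing a pointer-keyed map with a vector indexed by basic block number, guarded by asserts) can be sketched as follows. This is a toy illustration in standard C++ rather than the LLVM containers; `ToyBlock`, `ToyLiveRegs` and `BlockStateTable` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy basic block: only its dense number matters here.
struct ToyBlock {
  unsigned Number;
};

// Toy per-block state.
struct ToyLiveRegs {
  int Clearance = 0;
};

// Per-block state stored in a vector, using the block number as the index
// instead of hashing a block pointer.
class BlockStateTable {
  std::vector<ToyLiveRegs> States;

public:
  explicit BlockStateTable(std::size_t NumBlocks) : States(NumBlocks) {}

  ToyLiveRegs &operator[](const ToyBlock &MBB) {
    assert(MBB.Number < States.size() && "block number out of range");
    return States[MBB.Number];
  }
};
```

The lookup becomes a bounds-checked array access, which relies on block numbers being dense and stable while the table is in use.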
* ExecutionDepsFix refactoring: | Marina Yatsina | 2018-01-22 | 1 | -73/+41
  - Remove unneeded includes and unneeded members
  - Use range iterators
  - Variable renaming, typedefs, extracting constants
  - Removing {} from one-line ifs
  This patch is NFC.
  This is one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
  Most of the patches are intended to refactor the existing code.
  Additional relevant reviews: https://reviews.llvm.org/D40330 https://reviews.llvm.org/D40332 https://reviews.llvm.org/D40333 https://reviews.llvm.org/D40334
  Differential Revision: https://reviews.llvm.org/D40331
  Change-Id: Ib59060ab3fa5bee3bf2ca2045c24e572635ee7f6
  llvm-svn: 323088
* Separate ExecutionDepsFix into 4 parts: | Marina Yatsina | 2018-01-22 | 1 | -142/+371
  1. ReachingDefsAnalysis - Identifies, for each instruction, the “closest” reaching def of a certain register. Used by BreakFalseDeps (for clearance calculation) and ExecutionDomainFix (for arbitrating conflicting domains).
  2. ExecutionDomainFix - Changes the variant of the instructions in order to minimize domain crossings.
  3. BreakFalseDeps - Breaks false dependencies.
  4. LoopTraversal - Creates a traversal order of the basic blocks that is optimal for loops (introduced in revision L293571). Both ExecutionDomainFix and ReachingDefsAnalysis use this to determine the order in which they traverse the basic blocks.
  This also included the following changes to the original ExecutionDepsFix logic:
  1. BreakFalseDeps and ReachingDefsAnalysis logic is no longer restricted by a register class.
  2. ReachingDefsAnalysis tracks liveness of reg units instead of reg indices into a given reg class.
  Additional changes in affected files:
  1. X86 and ARM targets now inherit from ExecutionDomainFix instead of ExecutionDepsFix. BreakFalseDeps was also added to the passes they activate.
  2. Comments and references to ExecutionDepsFix replaced with ExecutionDomainFix and BreakFalseDeps, as appropriate.
  Additional refactoring changes will follow.
  This commit is (almost) NFC. The only functional change is that BreakFalseDeps will now break dependencies for all register classes. Since no additional instructions were added to the list of instructions that have false dependencies, there is no actual change yet. In a future commit several instructions (and tests) will be added.
  This is the first of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869
  Most of the patches are intended to refactor the existing code.
  Additional relevant reviews: https://reviews.llvm.org/D40331 https://reviews.llvm.org/D40332 https://reviews.llvm.org/D40333 https://reviews.llvm.org/D40334
  Differential Revision: https://reviews.llvm.org/D40330
  Change-Id: Icaeb75e014eff96a8f721377783f9a3e6c679275
  llvm-svn: 323087
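
The idea of a “closest” reaching def mentioned in this commit message can be shown on a single basic block. The sketch below is a toy under simplifying assumptions: registers are plain integers, only one block is considered, and `closestReachingDef` is a hypothetical helper, not the pass itself, which also propagates information across blocks in loop traversal order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy single-block model. DefsPerInstr[I] lists the registers written by
// instruction I. The result holds, for each instruction, the index of the
// closest earlier instruction that defines Reg, or -1 if there is none.
inline std::vector<int>
closestReachingDef(const std::vector<std::vector<int>> &DefsPerInstr, int Reg) {
  std::vector<int> Result;
  int LastDef = -1;
  for (std::size_t I = 0; I < DefsPerInstr.size(); ++I) {
    Result.push_back(LastDef); // defs made by instruction I itself do not reach I
    for (int D : DefsPerInstr[I])
      if (D == Reg)
        LastDef = static_cast<int>(I);
  }
  return Result;
}
```

In this toy model, the clearance of an instruction with respect to `Reg` would be its own index minus the returned def index, whenever the latter is not -1.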
* [SelectionDAG] Fix codegen of vector stores with non byte-sized elements. | Jonas Paulsson | 2018-01-20 | 3 | -6/+38
  This was completely broken, but hopefully fixed by this patch.
  In cases where it is needed, a vector with non byte-sized elements is stored by extracting, zero-extending, shifting and or-ing the elements into an integer of the same width as the vector, which is then stored.
  Review: Eli Friedman, Ulrich Weigand
  https://reviews.llvm.org/D42100#inline-369520
  https://bugs.llvm.org/show_bug.cgi?id=35520
  llvm-svn: 323042
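
The extract, zero-extend, shift and or sequence described in this commit message amounts to bit-packing the vector into one integer. A minimal sketch, assuming element 0 goes into the low bits (a little-endian style layout) and a total width of at most 64 bits; `packElements` is an illustrative helper, not the SelectionDAG code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Pack elements of EltBits bits each into a single integer. Each element is
// masked to its width (the zero-extension step), shifted into position and
// or-ed into the result. Element 0 occupies the lowest bits (assumption).
inline std::uint64_t packElements(const std::vector<std::uint64_t> &Elts,
                                  unsigned EltBits) {
  assert(EltBits > 0 && EltBits < 64 && Elts.size() * EltBits <= 64);
  const std::uint64_t Mask = (std::uint64_t{1} << EltBits) - 1;
  std::uint64_t Packed = 0;
  for (std::size_t I = 0; I < Elts.size(); ++I)
    Packed |= (Elts[I] & Mask) << (I * EltBits);
  return Packed;
}
```

For example, a toy v4i1 holding 1, 0, 1, 1 packs to the bit pattern 0b1101, which is then written out with a single integer store.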
* CodeGen: handle llvm.used properly for COFF | Saleem Abdulrasool | 2018-01-20 | 2 | -0/+35
  `llvm.used` contains a list of pointers to named values which the compiler, assembler, and linker are required to treat as if there is a reference that they cannot see. Ensure that the symbols are preserved by adding an explicit `-include` reference to the linker command.
  llvm-svn: 323017
* Add optional DICompileUnit to DIBuilder + make outliner debug info use it | Jessica Paquette | 2018-01-19 | 1 | -39/+68
  Previously, the DIBuilder didn't expose functionality to set its compile unit in any other way than calling createCompileUnit. This meant that the outliner, which creates new functions, had to create a new compile unit for its debug info.
  This commit adds an optional parameter in the DIBuilder's constructor which lets you set its CU at construction.
  It also changes the MachineOutliner so that it keeps track of the DISubprograms for each outlined sequence. If debugging information is requested, then it uses one of the outlined sequence's DISubprograms to grab a CU. It then uses that CU to construct the DISubprogram for the new outlined function.
  The test has also been updated to reflect this change.
  See https://reviews.llvm.org/D42254 for more information. Also see the e-mail discussion on D42254 in llvm-commits for more context.
  llvm-svn: 322992
* [SelectionDAG] Teach computeKnownBits about ATOMIC_CMP_SWAP_WITH_SUCCESS boolean return value | Ulrich Weigand | 2018-01-19 | 1 | -0/+1
  The second return value of ATOMIC_CMP_SWAP_WITH_SUCCESS is known to be a boolean, and should therefore be treated by computeKnownBits just like the second return values of SMULO / UMULO.
  Differential Revision: https://reviews.llvm.org/D42067
  llvm-svn: 322985
* Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1) | Daniel Neilson | 2018-01-19 | 1 | -0/+4
  Summary:
  This is a resurrection of work first proposed and discussed in Aug 2015: http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html and initially landed (but then backed out) in Nov 2015: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
  The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument which is required to be a constant integer. It represents the alignment of the dest (and source), and so must be the minimum of the actual alignment of the two.
  This change is the first in a series that allows source and dest to each have their own alignments by using the alignment attribute on their arguments.
  In this change we:
  1) Remove the alignment argument.
  2) Add alignment attributes to the source & dest arguments. We, temporarily, require that the alignments for source & dest be equal.
  For example, code which used to read:
    call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)
  will now read
    call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false)
  Downstream users may have to update their lit tests that check for @llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script may help with updating the majority of your tests, but it does not catch all possible patterns so some manual checking and updating will be required.
s~declare void @llvm\.mem(set|cpy|move)\.p([^(]*)\((.*), i32, i1\)~declare void @llvm.mem\1.p\2(\3, i1)~g s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* align \6 \3, i8 \4, i8 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* align \6 \3, i8 \4, i16 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* align \6 \3, i8 \4, i32 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* align \6 \3, i8 \4, i64 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* align \6 \3, i8 \4, i128 \5, i1 \7)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* \4, i8\5* \6, i8 \7, i1 \8)~g s~call 
void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* \4, i8\5* \6, i16 \7, i1 \8)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* \4, i8\5* \6, i32 \7, i1 \8)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* \4, i8\5* \6, i64 \7, i1 \8)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* \4, i8\5* \6, i128 \7, i1 \8)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g The remaining changes in the series will: Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. Step 3) Update Clang to use the new IRBuilder API. 
  Step 4) Update Polly to use the new IRBuilder API.
  Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use MemIntrinsicInst::[get|set]Alignment() to use getDestAlignment() and getSourceAlignment() instead.
  Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods.
  Reviewers: pete, hfinkel, lhames, reames, bollu
  Reviewed By: reames
  Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits
  Differential Revision: https://reviews.llvm.org/D41675
  llvm-svn: 322965
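
The Step 1 rewrite described in this commit message (drop the trailing alignment argument, attach the same `align` attribute to both pointer arguments) can be sketched as a small text builder. This is a toy under stated assumptions: `ptrArg` and `newStyleMemcpy` are hypothetical helpers, only the `p0i8.p0i8.i32` memcpy form from the example is produced, and alignments of 0 or 1 get no attribute, mirroring what the sed script in the message does.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Render one i8* argument, adding "align N" only when N is greater than 1.
inline std::string ptrArg(const std::string &Name, unsigned Align) {
  std::string S = "i8* ";
  if (Align > 1)
    S += "align " + std::to_string(Align) + " ";
  return S + "%" + Name;
}

// Build the new-style call text. Step 1 still requires equal alignments on
// both pointers, so the minimum of the two actual alignments is used for
// both, which is the value the old single argument had to carry.
inline std::string newStyleMemcpy(unsigned DestAlign, unsigned SrcAlign,
                                  unsigned Len) {
  unsigned Align = std::min(DestAlign, SrcAlign);
  return "call void @llvm.memcpy.p0i8.p0i8.i32(" + ptrArg("dest", Align) +
         ", " + ptrArg("src", Align) + ", i32 " + std::to_string(Len) +
         ", i1 false)";
}
```

Once the later steps of the series land, the two `ptrArg` calls could carry different alignments; the `std::min` is only there to model the temporary Step 1 restriction.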
* [CodeGen] Unify printing format of debug-location in both MIR and -debug | Francis Visoiu Mistrih | 2018-01-19 | 2 | -9/+14
  Use "debug-location" instead of "; dbg:" in MI::print.
  llvm-svn: 322936
* Split MachineLICM into EarlyMachineLICM and MachineLICM; NFCMatthias Braun2018-01-193-64/+79
| | | | | | | | | | | | | This avoids playing games with pseudo pass IDs and avoids using an unreliable MRI::isSSA() check to determine whether register allocation has happened. Note that this renames: - MachineLICMID -> EarlyMachineLICM - PostRAMachineLICMID -> MachineLICMID to be consistent with the EarlyTailDuplicate/TailDuplicate naming. llvm-svn: 322927
* Split TailDuplicatePass into pre- and post-RA variant; NFCMatthias Braun2018-01-193-27/+39
| | | | | | | | Split TailDuplicatePass into EarlyTailDuplicate and TailDuplicate. This avoids playing games with fake pass IDs and using MRI::isSSA() to determine pre-/post-RA state. llvm-svn: 322926
* Revert [CGP] Re-enable Select in complex addressing modeSerguei Katkov2018-01-191-1/+1
| | | | | | One of the buildbots failed. Revert for now until the issue is fixed. llvm-svn: 322923
* AArch64: Fix emergency spillslot being out of reach for large callframesMatthias Braun2018-01-192-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Re-commit of r322200: The testcase shouldn't hit machineverifiers anymore with r322917 in place. Large callframes (calls with several hundreds or thousands of parameters) could lead to situations in which the emergency spillslot is out of range to be addressed relative to the stack pointer. This commit forces the use of a frame pointer in the presence of large callframes. This commit does several things: - Compute max callframe size at the end of instruction selection. - Add mirFileLoaded target callback. Use it to compute the max callframe size after loading a .mir file when the size wasn't specified in the file. - Let TargetFrameLowering::hasFP() return true if there exists a callframe > 255 bytes. - Always place the emergency spillslot close to FP if we have a frame pointer. - Note that `useFPForScavengingIndex()` would previously return false when a base pointer was available leading to the emergency spillslot getting allocated late (that's the whole effect of this callback). Which made no sense to me so I took this case out: Even though the emergency spillslot is technically not referenced by FP in this case we still want it allocated early. Differential Revision: https://reviews.llvm.org/D40876 llvm-svn: 322919
* [CodeGen][NFC] Rename IsVerbose to IsStandalone in Machine*::printFrancis Visoiu Mistrih2018-01-186-17/+18
| | | | | | | | Committed r322867 too soon. Differential Revision: https://reviews.llvm.org/D42239 llvm-svn: 322868
* [CodeGen] Print RegClasses on MI in verbose modeFrancis Visoiu Mistrih2018-01-186-21/+24
| | | | | | | | | | | | | r322086 removed the trailing information describing reg classes for each register. This patch adds printing reg classes next to every register when individual operands/instructions/basic blocks are printed. In the case of dumping MIR or printing a full function, by default don't print it. Differential Revision: https://reviews.llvm.org/D42239 llvm-svn: 322867
* [TargetLowering] add punctuation for readability; NFCSanjay Patel2018-01-181-1/+1
| | | | llvm-svn: 322855
* [CodeGen][NFC] Refactor MachineInstr::printFrancis Visoiu Mistrih2018-01-181-21/+45
| | | | | | | * Handle more cases where the MI is not attached yet * Add similar asserts like in MIRPrinter::print llvm-svn: 322848
* [SelectionDAG] Convert assert to conditionSam Parker2018-01-181-3/+2
| | | | | | | | | Follow-up to r322120 which can cause assertions for AArch64 because v1f64 and v1i64 are legal types. Differential Revision: https://reviews.llvm.org/D42097 llvm-svn: 322823
* [DAGCombiner] Add a DAG combine to turn a splat build_vector where the splat ↵Craig Topper2018-01-181-0/+23
| | | | | | | | | | element is a bitcast from a vector type into a concat_vector For example, a build_vector of i64 bitcasted from v2i32 can be turned into a concat_vectors of the v2i32 vectors with a bitcast to a vXi64 type Differential Revision: https://reviews.llvm.org/D42090 llvm-svn: 322811
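The equivalence behind this combine can be checked with a small byte-level model. The sketch below is plain Python, not SelectionDAG code: the helper names are made up for the illustration, and little-endian lane layout is assumed.

```python
import struct

def v2i32_bytes(v):
    """Memory image of a v2i32 value (little-endian layout assumed)."""
    return struct.pack("<2I", *v)

def splat_of_bitcast(v, n):
    """Lanes of: build_vector of n copies of (bitcast v2i32 v to i64)."""
    scalar = struct.unpack("<Q", v2i32_bytes(v))[0]  # bitcast v2i32 -> i64
    return [scalar] * n                              # splat build_vector

def bitcast_of_concat(v, n):
    """Lanes of: bitcast (concat_vectors v, v, ...) to a vector of n i64."""
    concat = v2i32_bytes(v) * n                      # concat_vectors of n copies
    return list(struct.unpack(f"<{n}Q", concat))     # bitcast to n x i64
```

Both helpers produce the same lanes for any input, which is what allows the splat to be rewritten as a concat_vectors followed by a single bitcast.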
* GlobalISel: Make MachineCSE runnable in the middle of the GlobalISelJustin Bogner2018-01-182-14/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Right now, it is not possible to run MachineCSE in the middle of the GlobalISel pipeline. Being able to run generic optimizations between the core passes of GlobalISel was one of the goals of the new ISel framework. This is the first attempt to do it. The problem is that MachineCSE pass assumes all register operands have a register class, which, in GlobalISel context, won't be true until after the InstructionSelect pass. The reason for this behaviour is that before replacing one virtual register with another, MachineCSE pass (and most of the other optimization machine passes) must check if the virtual registers' constraints have a (sufficiently large) intersection, and constrain the resulting register appropriately if such intersection exists. GlobalISel extends the representation of such constraints from just a register class to a triple (low-level type, register bank, register class). This commit adds MachineRegisterInfo::constrainRegAttrs method that extends MachineRegisterInfo::constrainRegClass to such a triple. The idea is that going forward we should use: - RegisterBankInfo::constrainGenericRegister within GlobalISel's InstructionSelect pass - MachineRegisterInfo::constrainRegClass within SelectionDAG ISel - MachineRegisterInfo::constrainRegAttrs everywhere else regardless of the target and instruction selector it uses. Patch by Roman Tereshin. Thanks! llvm-svn: 322805
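The idea of constraining a (low-level type, register bank, register class) triple can be pictured with a toy model. The sketch below is not the LLVM implementation: `RegAttrs`, `constrain_reg_attrs` and the unification rules are simplifying assumptions chosen for illustration, with a register class modelled as a set of physical register names.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet, Optional

@dataclass(frozen=True)
class RegAttrs:
    llt: Optional[str] = None                   # low-level type, e.g. "s32"
    bank: Optional[str] = None                  # register bank, e.g. "GPR"
    reg_class: Optional[FrozenSet[str]] = None  # class as a set of physregs

def constrain_reg_attrs(reg, other, min_num_regs=1):
    """Return reg narrowed to also satisfy other, or None if they do not unify."""
    if reg.reg_class is not None and other.reg_class is not None:
        # Both concrete: the classes must have a sufficiently large intersection.
        common = reg.reg_class & other.reg_class
        return replace(reg, reg_class=common) if len(common) >= min_num_regs else None
    if reg.reg_class is not None or other.reg_class is not None:
        return None  # assumed rule: a generic and a concrete register do not unify
    if reg.llt != other.llt:
        return None  # both generic: low-level types must agree
    if reg.bank is not None and other.bank is not None and reg.bank != other.bank:
        return None  # both generic: banks must agree when both are assigned
    return replace(reg, bank=reg.bank or other.bank)
```

A pass in the style of MachineCSE would call such a check before replacing one virtual register with another and skip the replacement when it returns `None`.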
* Fix the failure caused by r322773Volkan Keles2018-01-181-8/+3
| | | | | | Do not run GlobalISel if `-fast-isel=0 -global-isel=false`. llvm-svn: 322800
* [MachineOutliner] Add DISubprograms to outlined functions.Jessica Paquette2018-01-181-2/+47
| | | | | | | | | | Before, it wasn't possible to get backtraces inside outlined functions. This commit adds DISubprograms to the IR functions created by the outliner which makes this possible. Also attached a test that ensures that the produced debug information is correct. This is useful to users that want to debug outlined code. llvm-svn: 322789
* [CodeGen] Hoist common AsmPrinter code out of X86, ARM, and AArch64Reid Kleckner2018-01-171-0/+16
| | | | | | | | | | | Every known PE COFF target emits /EXPORT: linker flags into a .drectve section. The AsmPrinter should handle this. While we're at it, use global_values() and emit each export flag with its own .ascii directive. This should make the .s file output more readable. llvm-svn: 322788
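The "one .ascii directive per export flag" layout can be sketched as follows. This is an illustrative Python model, not the AsmPrinter code: the function name `export_directives`, the `"yn"` section flags, the `,DATA` suffix for data symbols and the exact whitespace are assumptions about typical COFF assembly output rather than details taken from the commit.

```python
def export_directives(symbols):
    """Render linker export flags, one .ascii directive per symbol.

    symbols: iterable of (name, is_data) pairs (hypothetical input shape).
    """
    lines = ['\t.section\t.drectve,"yn"']  # linker-directive section (assumed flags)
    for name, is_data in symbols:
        flag = f" /EXPORT:{name}" + (",DATA" if is_data else "")
        lines.append(f'\t.ascii\t"{flag}"')
    return "\n".join(lines)
```

Emitting each flag on its own line keeps the generated .s file easy to read and to diff, which is the readability benefit the commit message mentions.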
* Add a TargetOption to enable/disable GlobalISelVolkan Keles2018-01-171-15/+14
| | | | | | | | | | | | | | | | | | | | | Summary: This patch adds a new target option in order to control GlobalISel. This will allow the users to enable/disable GlobalISel prior to the backend by calling `TargetMachine::setGlobalISel(bool Enable)`. No test case as there is already a test to check GlobalISel command line options. See: CodeGen/AArch64/GlobalISel/gisel-commandline-option.ll. Reviewers: qcolombet, aemerson, ab, dsanders Reviewed By: qcolombet Subscribers: rovka, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D42137 llvm-svn: 322773
* Add support for emitting libcalls for x86_fp80 -> fp128 and vice-versaBenjamin Kramer2018-01-171-0/+6
| | | | | | compiler_rt doesn't provide them (yet), but libgcc does. PR34076. llvm-svn: 322772
* [LegalizeDAG] Fix ATOMIC_CMP_SWAP_WITH_SUCCESS legalization.Eli Friedman2018-01-171-2/+2
| | | | | | | | | | | | | The code wasn't zero-extending correctly, so the comparison could spuriously fail. Adds some AArch64 tests to cover this case. Inspired by D41791. Differential Revision: https://reviews.llvm.org/D41798 llvm-svn: 322767
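To see why a missing zero-extension can make the comparison fail spuriously, consider a narrow (here 8-bit) compare-and-swap whose loaded value arrives zero-extended in a wider register while the expected value still carries sign-extended high bits. The sketch below is a plain Python model with hypothetical helper names, not the LegalizeDAG code.

```python
def zero_extend_8_to_32(x):
    """Keep the low 8 bits, clear the rest."""
    return x & 0xFF

def sign_extend_8_to_32(x):
    """Replicate bit 7 into bits 8..31 of a 32-bit value."""
    x &= 0xFF
    return x | 0xFFFFFF00 if x & 0x80 else x

def success_flag(loaded, expected, zero_extend_expected):
    """Model the 'success' result of an i8 cmpxchg promoted to 32 bits."""
    if zero_extend_expected:
        expected = zero_extend_8_to_32(expected)
    return loaded == expected

# The byte in memory and the expected byte are both 0xF0, so the swap succeeded.
loaded = zero_extend_8_to_32(0xF0)    # value as it comes back from the atomic op
expected = sign_extend_8_to_32(0xF0)  # expected value with sign-extended high bits
spurious_failure = not success_flag(loaded, expected, zero_extend_expected=False)
fixed_result = success_flag(loaded, expected, zero_extend_expected=True)
```

With the extension applied to both sides, only the low 8 bits take part in the comparison, so the success flag matches what actually happened in memory.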
* [GISel] Make constrainSelectedInstRegOperands() available to the legalizer. NFCAditya Nandakumar2018-01-172-44/+45
| | | | | | https://reviews.llvm.org/D42149 llvm-svn: 322743
* [ARM GlobalISel] Legalize G_FPEXT and G_FPTRUNCDiana Picus2018-01-171-0/+47
| | | | | | | | | | | Mark G_FPEXT and G_FPTRUNC as legal or libcall, depending on hardware support, but only for conversions between float and double. Also add the necessary boilerplate so that the LegalizerHelper can introduce the required libcalls. This also works only for float and double, but isn't too difficult to extend when the need arises. llvm-svn: 322651
* [NFC] fix trivial typos in commentsHiroshi Inoue2018-01-174-7/+7
| | | | | | "the the" -> "the" llvm-svn: 322636