summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* [Support] Move DJB hash to support. NFCJonas Devlieghere2018-01-281-1/+2
| | | | | | | | | | | This patch moves the DJB hash to support. This is consistent with other hashing algorithms living there. The hash is used by the DWARF accelerator tables. We're doing this now because the hashing function is needed by dsymutil and we don't want to link against libBinaryFormat. Differential revision: https://reviews.llvm.org/D42594 llvm-svn: 323616
* [TargetLowering] Teach TargetLowering::SimplifySetCC to simplify setcc of ↵Craig Topper2018-01-271-14/+16
| | | | | | | | vXi1 vectors into logic ops. This transform was already being done for setcc of scalar i1. This extends it to vectors. llvm-svn: 323585
* [SelectionDAG] Make DAGTypeLegalizer::PromoteSetCCOperands handle ↵Craig Topper2018-01-271-4/+4
| | | | | | | | SETEQ/SETNE correctly for vector types. The code was using getValueSizeInBits and combining with the result of a call to DAG.ComputeNumSignBits. But for vector types getValueSizeInBits returns the width of the full vector while ComputeNumSignBits is going to give a number no larger than the width of a single element. So we should be using getScalarValueSizeInBits to get the element width. llvm-svn: 323583
* [GlobalISel][Legalizer] Convert the FP constants to the right APFloat type ↵Amara Emerson2018-01-271-1/+18
| | | | | | | | | | | for G_FCONSTANT. We weren't converting the immediate ConstantFP during legalization, which caused the wrong bit patterns to be emitted for half type FP constants. Fixes PR36106. llvm-svn: 323582
* [LivePhysRegs] Preserve pristine regs in blocks with no successors.Eli Friedman2018-01-261-2/+2
| | | | | | | | | | | | | | | One common source of blocks with no successors is calls to noreturn functions; we want to preserve pristine registers in case they throw an exception. The whole pristine register thing is messy (we should really prefer to explicitly model registers), but this fills a hole in the model for now. Fixes https://bugs.llvm.org/show_bug.cgi?id=36073. Differential Revision: https://reviews.llvm.org/D42509 llvm-svn: 323559
* [SelectionDAGISel] Add a debug print before call to Select. Adjust where ↵Craig Topper2018-01-261-6/+5
| | | | | | | | | | | | blank lines are printed during isel process to make things more sensibly grouped. Previously some targets printed their own message at the start of Select to indicate what they were selecting. For the targets that didn't, it means there was no print of the root node before any custom handling in the target executed. So if the target did something custom and never called SelectNodeCommon, no print would be made. For the targets that did print a message in Select, if they didn't custom handle a node SelectNodeCommon would reprint the root node before walking the isel table. It seems better to just print the message before the call to Select so all targets behave the same. And then remove the root node printing from SelectNodeCommon and just leave a message that says we're starting the table search. There were also some oddities in blank line behavior. Usually due to a \n after a call to SelectionDAGNode::dump which already inserted a new line. llvm-svn: 323551
* [DWARF] Generate DWARF v5 string offsets tables along with strx* index forms.Wolfgang Pieb2018-01-269-38/+181
| | | | | | | | | | | | Summary: This is the producer side for DWARF v5 string offsets tables. The reader/consumer side was committed with r321295. All compile and type units in a module share a contribution to the string offsets table. Indirect strings use the strx{1,2,3,4} index forms. Reviewers: dblaikie, aprantl, JDevliegehere Differential Revision: https://reviews.llvm.org/D42021 llvm-svn: 323546
* [DAG] Teach findBaseOffset to interpret indexes of indexed memory operationsNirav Dave2018-01-261-8/+35
| | | | | | Indexed outputs are addition / subtractions and can be interpreted as such. llvm-svn: 323539
* [DAGCombine] reduceBuildVecToShuffle - ensure EXTRACT_VECTOR_ELT index is in ↵Simon Pilgrim2018-01-261-1/+5
| | | | | | | | range From OSS Fuzz Test Case #5688 llvm-svn: 323535
* [MIR] Add support for addrspace in MIRFrancis Visoiu Mistrih2018-01-264-0/+20
| | | | | | | | | | Add support for printing / parsing the addrspace of a MachineMemOperand. Fixes PR35970. Differential Revision: https://reviews.llvm.org/D42502 llvm-svn: 323521
* [NFC] fix trivial typos in comments and documentsHiroshi Inoue2018-01-264-4/+4
| | | | | | "in in" -> "in", "on on" -> "on" etc. llvm-svn: 323508
* [SelectionDAG] Replace a std::vector<SDValue> with a SmallVector.Craig Topper2018-01-261-1/+1
| | | | | | It likely the number of elements in the type we're legalizing here is reasonably small. llvm-svn: 323505
* [CGP] Re-enable Select in complex addressing mode.Serguei Katkov2018-01-261-1/+1
| | | | | | Switch Select handling on after fixing two bugs: rL323192 and rL323497. llvm-svn: 323498
* [CodeGen] Ignore private symbols in llvm.used for COFFShoaib Meenai2018-01-261-4/+4
| | | | | | | Similar to the existing handling for internal symbols, private symbols are also not visible to the linker and should be ignored. llvm-svn: 323483
* [GISel]: Implement GlobalISel combiner API.Aditya Nandakumar2018-01-253-0/+124
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://reviews.llvm.org/D41373 The various components are GICombinerHelper contains transformations that are common to all targets. Targets can pick and choose which transformations (at function/opcode granularity) each pass uses via configuring a GICombinerInfo. GICombiner contains some common code and it does the traversal, driving of combines, worklist management and iterating until convergence. GICombinerInfo is an interface with a virtual method called combine. The combiner info will allow targets to pick and choose (or implement their own specific combines). CombineInfos can make use of available combines in GICombineHelper to configure the transformations for a particular pass. Currently this approach allows cherry picking transformations from helpers (at function/opcode granularity) and also allows early returning on specific transformations. Targets also get to prioritize whether target specific combines run before/after the opt-in generic combines. Ideally we would like this part to be configured by both C++ and Tablegen. The CombinerInfo also has a field which indicates how to deal with IllegalOps (ie - should we allow to create them/or legalize them?). A CombinerPass would configure a CombinerInfo, create the GICombiner with the Info, and call GICombiner::combineMachineInstrs(MachineFunction&). This organization is very similar to the GISelLegalizer. llvm-svn: 323392
* [GlobalISel] Don't fall back to FastISel.Amara Emerson2018-01-242-1/+5
| | | | | | | Apparently checking the pass structure isn't enough to ensure that we don't fall back to FastISel, as it's set up as part of the SelectionDAGISel. llvm-svn: 323369
* [globalisel] Introduce LegalityQuery to better encapsulate the legalizer ↵Daniel Sanders2018-01-242-14/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | decisions. NFC. Summary: `getAction(const InstrAspect &) const` breaks encapsulation by exposing the smaller components that are used to decide how to legalize an instruction. This is a problem because we need to change the implementation of LegalizerInfo so that it's able to describe particular type combinations rather than just cartesian products of types. For example, declaring the following setAction({..., 0, s32}, Legal) setAction({..., 0, s64}, Legal) setAction({..., 1, s32}, Legal) setAction({..., 1, s64}, Legal) currently declares these type combinations as legal: {s32, s32} {s64, s32} {s32, s64} {s64, s64} but we currently have no means to say that, for example, {s64, s32} is not legal. Some operations such as G_INSERT/G_EXTRACT/G_MERGE_VALUES/ G_UNMERGE_VALUES has relationships between the types that are currently described incorrectly. Additionally, G_LOAD/G_STORE currently have no means to legalize non-atomics differently to atomics. The necessary information is in the MMO but we have no way to use this in the legalizer. Similarly, there is currently no way for the register type and the memory type to differ so there is no way to cleanly represent extending-load/truncating-store in a way that can't be broken by optimizers (resulting in illegal MIR). This patch introduces LegalityQuery which provides all the information needed by the legalizer to make a decision on whether something is legal and how to legalize it. Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar, volkan, reames, bogner Reviewed By: bogner Subscribers: bogner, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D42244 llvm-svn: 323342
* [DebugInfo] Emit DWARF reference for DIVariable 'count' in DISubrangeSander de Smalen2018-01-242-1/+8
| | | | | | | | | | | | | | | | | | | | | | Summary: This patch implements the codegen of DWARF debug info for non-constant 'count' fields for DISubrange. This is patch [2/3] in a series to extend LLVM's DISubrange Metadata node to support debugging of C99 variable length arrays and vectors with runtime length like the Scalable Vector Extension for AArch64. It is also a first step towards representing more complex cases like arrays in Fortran. Reviewers: echristo, pcc, aprantl, dexonsmith, clayborg, kristof.beyls, dblaikie Reviewed By: aprantl Subscribers: fhahn, aemerson, rengolin, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D41696 llvm-svn: 323323
* [Metadata] Extend 'count' field of DISubrange to take a metadata nodeSander de Smalen2018-01-242-2/+6
| | | | | | | | | | | | | | | | | | | | | | | Summary: This patch extends the DISubrange 'count' field to take either a (signed) constant integer value or a reference to a DILocalVariable or DIGlobalVariable. This is patch [1/3] in a series to extend LLVM's DISubrange Metadata node to support debugging of C99 variable length arrays and vectors with runtime length like the Scalable Vector Extension for AArch64. It is also a first step towards representing more complex cases like arrays in Fortran. Reviewers: echristo, pcc, aprantl, dexonsmith, clayborg, kristof.beyls, dblaikie Reviewed By: aprantl Subscribers: rnk, probinson, fhahn, aemerson, rengolin, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D41695 llvm-svn: 323313
* [DAGCombiner] Bail out if vector size is not a multipleSven van Haastregt2018-01-241-0/+4
| | | | | | | | | | | | For the included test case, the DAG transformation concat_vectors(scalar, undef) -> scalar_to_vector(sclr) would attempt to create a v2i32 vector for a v9i8 concat_vector. Bail out to avoid creating a bitcast with mismatching sizes later on. Differential Revision: https://reviews.llvm.org/D42379 llvm-svn: 323312
* [GlobalMerge] Don't merge dllexport globalsMartin Storsjo2018-01-241-1/+2
| | | | | | | | | Merging such globals loses the dllexport attribute. Add a test to check that normal globals still are merged. Differential Revision: https://reviews.llvm.org/D42127 llvm-svn: 323307
* [GISel]: Remove redundant copies at the end of ISelAditya Nandakumar2018-01-241-0/+32
| | | | | | | | | https://reviews.llvm.org/D42402 A lot of these copies are useless (copies b/w VRegs having the same regclass) and should be cleaned up. llvm-svn: 323291
* [safestack] Inline safestack pointer access when possible.Evgeniy Stepanov2018-01-231-1/+50
| | | | | | | | | | | | | | | Summary: This adds an -mllvm flag that forces the use of a runtime function call to get the unsafe stack pointer, the same that is currently used on non-x86, non-aarch64 android. The call may be inlined. Reviewers: pcc Subscribers: aemerson, kristof.beyls, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D37405 llvm-svn: 323259
* CodeGen: Fix assertion in ScheduleDAGMILive::scheduleMI due to llvm.dbg.valueYaxun Liu2018-01-231-0/+1
| | | | | | | | | | Fix a bug in ScheduleDAGMILive::scheduleMI which causes BotRPTracker not tracking CurrentBottom in some rare cases involving llvm.dbg.value. This issues causes amdgcn target to assert when compiling some user codes with -g. Differential Revision: https://reviews.llvm.org/D42394 llvm-svn: 323214
* [CGP] Fix the GV handling in complex addressing modeSerguei Katkov2018-01-231-15/+21
| | | | | | | | | | | | | | | If in complex addressing mode the difference is in GV then base reg should not be installed because we plan to use base reg as a merge point of different GVs. This is a fix for PR35980. Reviewers: reames, john.brawn, santosh Reviewed By: john.brawn Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42230 llvm-svn: 323192
* Introduce the "retpoline" x86 mitigation technique for variant #2 of the ↵Chandler Carruth2018-01-225-0/+230
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre.. Summary: First, we need to explain the core of the vulnerability. Note that this is a very incomplete description, please see the Project Zero blog post for details: https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html The basis for branch target injection is to direct speculative execution of the processor to some "gadget" of executable code by poisoning the prediction of indirect branches with the address of that gadget. The gadget in turn contains an operation that provides a side channel for reading data. Most commonly, this will look like a load of secret data followed by a branch on the loaded value and then a load of some predictable cache line. The attacker then uses timing of the processors cache to determine which direction the branch took *in the speculative execution*, and in turn what one bit of the loaded value was. Due to the nature of these timing side channels and the branch predictor on Intel processors, this allows an attacker to leak data only accessible to a privileged domain (like the kernel) back into an unprivileged domain. The goal is simple: avoid generating code which contains an indirect branch that could have its prediction poisoned by an attacker. In many cases, the compiler can simply use directed conditional branches and a small search tree. LLVM already has support for lowering switches in this way and the first step of this patch is to disable jump-table lowering of switches and introduce a pass to rewrite explicit indirectbr sequences into a switch over integers. However, there is no fully general alternative to indirect calls. We introduce a new construct we call a "retpoline" to implement indirect calls in a non-speculatable way. It can be thought of loosely as a trampoline for indirect calls which uses the RET instruction on x86. Further, we arrange for a specific call->ret sequence which ensures the processor predicts the return to go to a controlled, known location. The retpoline then "smashes" the return address pushed onto the stack by the call with the desired target of the original indirect call. The result is a predicted return to the next instruction after a call (which can be used to trap speculative execution within an infinite loop) and an actual indirect branch to an arbitrary address. On 64-bit x86 ABIs, this is especially easily done in the compiler by using a guaranteed scratch register to pass the target into this device. For 32-bit ABIs there isn't a guaranteed scratch register and so several different retpoline variants are introduced to use a scratch register if one is available in the calling convention and to otherwise use direct stack push/pop sequences to pass the target address. This "retpoline" mitigation is fully described in the following blog post: https://support.google.com/faqs/answer/7625886 We also support a target feature that disables emission of the retpoline thunk by the compiler to allow for custom thunks if users want them. These are particularly useful in environments like kernels that routinely do hot-patching on boot and want to hot-patch their thunk to different code sequences. They can write this custom thunk and use `-mretpoline-external-thunk` *in addition* to `-mretpoline`. In this case, on x86-64 thu thunk names must be: ``` __llvm_external_retpoline_r11 ``` or on 32-bit: ``` __llvm_external_retpoline_eax __llvm_external_retpoline_ecx __llvm_external_retpoline_edx __llvm_external_retpoline_push ``` And the target of the retpoline is passed in the named register, or in the case of the `push` suffix on the top of the stack via a `pushl` instruction. There is one other important source of indirect branches in x86 ELF binaries: the PLT. These patches also include support for LLD to generate PLT entries that perform a retpoline-style indirection. The only other indirect branches remaining that we are aware of are from precompiled runtimes (such as crt0.o and similar). The ones we have found are not really attackable, and so we have not focused on them here, but eventually these runtimes should also be replicated for retpoline-ed configurations for completeness. For kernels or other freestanding or fully static executables, the compiler switch `-mretpoline` is sufficient to fully mitigate this particular attack. For dynamic executables, you must compile *all* libraries with `-mretpoline` and additionally link the dynamic executable and all shared libraries with LLD and pass `-z retpolineplt` (or use similar functionality from some other linker). We strongly recommend also using `-z now` as non-lazy binding allows the retpoline-mitigated PLT to be substantially smaller. When manually apply similar transformations to `-mretpoline` to the Linux kernel we observed very small performance hits to applications running typical workloads, and relatively minor hits (approximately 2%) even for extremely syscall-heavy applications. This is largely due to the small number of indirect branches that occur in performance sensitive paths of the kernel. When using these patches on statically linked applications, especially C++ applications, you should expect to see a much more dramatic performance hit. For microbenchmarks that are switch, indirect-, or virtual-call heavy we have seen overheads ranging from 10% to 50%. However, real-world workloads exhibit substantially lower performance impact. Notably, techniques such as PGO and ThinLTO dramatically reduce the impact of hot indirect calls (by speculatively promoting them to direct calls) and allow optimized search trees to be used to lower switches. If you need to deploy these techniques in C++ applications, we *strongly* recommend that you ensure all hot call targets are statically linked (avoiding PLT indirection) and use both PGO and ThinLTO. Well tuned servers using all of these techniques saw 5% - 10% overhead from the use of retpoline. We will add detailed documentation covering these components in subsequent patches, but wanted to make the core functionality available as soon as possible. Happy for more code review, but we'd really like to get these patches landed and backported ASAP for obvious reasons. We're planning to backport this to both 6.0 and 5.0 release streams and get a 5.0 release with just this cherry picked ASAP for distros and vendors. This patch is the work of a number of people over the past month: Eric, Reid, Rui, and myself. I'm mailing it out as a single commit due to the time sensitive nature of landing this and the need to backport it. Huge thanks to everyone who helped out here, and everyone at Intel who helped out in discussions about how to craft this. Also, credit goes to Paul Turner (at Google, but not an LLVM contributor) for much of the underlying retpoline design. Reviewers: echristo, rnk, ruiu, craig.topper, DavidKreitzer Subscribers: sanjoy, emaste, mcrosier, mgorny, mehdi_amini, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D41723 llvm-svn: 323155
* Fixing warnings caused by commit 323095Marina Yatsina2018-01-223-10/+10
| | | | | Change-Id: I4e1f81db2f5382a820f4016c23b243e4d5aebf51 llvm-svn: 323114
* Separate LoopTraversal, ReachingDefAnalysis and BreakFalseDeps into their ↵Marina Yatsina2018-01-225-417/+547
| | | | | | | | | | | | | | | | | | own files. This is the one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869 Most of the patches are intended at refactoring the existent code. Additional relevant reviews: https://reviews.llvm.org/D40330 https://reviews.llvm.org/D40331 https://reviews.llvm.org/D40332 https://reviews.llvm.org/D40334 Differential Revision: https://reviews.llvm.org/D40333 Change-Id: Ie5f8eb34d98cfdfae23a3072eb69b5794f0e2d56 llvm-svn: 323095
* Rename ExecutionDepsFix files to ExecutionDomainFixMarina Yatsina2018-01-222-3/+3
| | | | | | | | | | | | | | | | This is the one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869 Most of the patches are intended at refactoring the existent code. Additional relevant reviews: https://reviews.llvm.org/D40330 https://reviews.llvm.org/D40331 https://reviews.llvm.org/D40333 https://reviews.llvm.org/D40334 Differential Revision: https://reviews.llvm.org/D40332 Change-Id: I6a048cca7fdafbfc42fb1bac94343e483befded8 llvm-svn: 323094
* ExecutionDepsFix refactoring:Marina Yatsina2018-01-221-53/+67
| | | | | | | | | | | | | | | | | | - clang-format This is the one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869 Most of the patches are intended at refactoring the existent code. Additional relevant reviews: https://reviews.llvm.org/D40330 https://reviews.llvm.org/D40332 https://reviews.llvm.org/D40333 https://reviews.llvm.org/D40334 Differential Revision: https://reviews.llvm.org/D40331 Change-Id: I131b126af13bc743bc5d69d83699e52b9b720979 llvm-svn: 323093
* ExecutionDepsFix refactoring:Marina Yatsina2018-01-221-66/+0
| | | | | | | | | | | | | | | | | | | | - Moving comments to class definition in header file - Changing comments to doxygen style - Rephrase loop traversal explaining comment This is the one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869 Most of the patches are intended at refactoring the existent code. Additional relevant reviews: https://reviews.llvm.org/D40330 https://reviews.llvm.org/D40332 https://reviews.llvm.org/D40333 https://reviews.llvm.org/D40334 Differential Revision: https://reviews.llvm.org/D40331 Change-Id: I9a12618db5b66128611fa71b54a233414f6012ac llvm-svn: 323092
* ExecutionDepsFix refactoring:Marina Yatsina2018-01-221-56/+49
| | | | | | | | | | | | | | | | | | - Removing LiveRegs This is the one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869 Most of the patches are intended at refactoring the existent code. Additional relevant reviews: https://reviews.llvm.org/D40330 https://reviews.llvm.org/D40332 https://reviews.llvm.org/D40333 https://reviews.llvm.org/D40334 Differential Revision: https://reviews.llvm.org/D40331 Change-Id: I8ab56d99951a6d6981542f68d94c1f624f3c9fbf llvm-svn: 323091
* ExecutionDepsFix refactoring:Marina Yatsina2018-01-221-50/+40
| | | | | | | | | | | | | | | | | | - Changing LiveRegs to be a vector This is the one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869 Most of the patches are intended at refactoring the existent code. Additional relevant reviews: https://reviews.llvm.org/D40330 https://reviews.llvm.org/D40332 https://reviews.llvm.org/D40333 https://reviews.llvm.org/D40334 Differential Revision: https://reviews.llvm.org/D40331 Change-Id: I9cdd364bd7bf2a0bf61ea41a48d4bd310ec3bce4 llvm-svn: 323090
* ExecutionDepsFix refactoring:Marina Yatsina2018-01-221-32/+38
| | | | | | | | | | | | | | | | | | | | | | - Changing DenseMap<MBB*, LiveReg*> to SmallVector<LiveReg*> - Now the MBB number will be the index of LiveReg in the vector. - Adding asserts This patch is NFC. This is the one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869 Most of the patches are intended at refactoring the existent code. Additional relevant reviews: https://reviews.llvm.org/D40330 https://reviews.llvm.org/D40332 https://reviews.llvm.org/D40333 https://reviews.llvm.org/D40334 Differential Revision: https://reviews.llvm.org/D40331 Change-Id: If4a3f141693d0361ddb292432337dbb63a1e69ee llvm-svn: 323089
* ExecutionDepsFix refactoring:Marina Yatsina2018-01-221-73/+41
| | | | | | | | | | | | | | | | | | | | | | | - Remove unneeded includes and unneeded members - Use range iterators - Variable renaming, typedefs, extracting constants - Removing {} from one line ifs This patch is NFC. This is the one of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869 Most of the patches are intended at refactoring the existent code. Additional relevant reviews: https://reviews.llvm.org/D40330 https://reviews.llvm.org/D40332 https://reviews.llvm.org/D40333 https://reviews.llvm.org/D40334 Differential Revision: https://reviews.llvm.org/D40331 Change-Id: Ib59060ab3fa5bee3bf2ca2045c24e572635ee7f6 llvm-svn: 323088
* Separate ExecutionDepsFix into 4 parts:Marina Yatsina2018-01-221-142/+371
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. ReachingDefsAnalysis - Allows to identify for each instruction what is the “closest” reaching def of a certain register. Used by BreakFalseDeps (for clearance calculation) and ExecutionDomainFix (for arbitrating conflicting domains). 2. ExecutionDomainFix - Changes the variant of the instructions in order to minimize domain crossings. 3. BreakFalseDeps - Breaks false dependencies. 4. LoopTraversal - Creatws a traversal order of the basic blocks that is optimal for loops (introduced in revision L293571). Both ExecutionDomainFix and ReachingDefsAnalysis use this to determine the order they will traverse the basic blocks. This also included the following changes to ExcecutionDepsFix original logic: 1. BreakFalseDeps and ReachingDefsAnalysis logic no longer restricted by a register class. 2. ReachingDefsAnalysis tracks liveness of reg units instead of reg indices into a given reg class. Additional changes in affected files: 1. X86 and ARM targets now inherit from ExecutionDomainFix instead of ExecutionDepsFix. BreakFalseDeps also was added to the passes they activate. 2. Comments and references to ExecutionDepsFix replaced with ExecutionDomainFix and BreakFalseDeps, as appropriate. Additional refactoring changes will follow. This commit is (almost) NFC. The only functional change is that now BreakFalseDeps will break dependency for all register classes. Since no additional instructions were added to the list of instructions that have false dependencies, there is no actual change yet. In a future commit several instructions (and tests) will be added. This is the first of multiple patches that fix bugzilla https://bugs.llvm.org/show_bug.cgi?id=33869 Most of the patches are intended at refactoring the existent code. Additional relevant reviews: https://reviews.llvm.org/D40331 https://reviews.llvm.org/D40332 https://reviews.llvm.org/D40333 https://reviews.llvm.org/D40334 Differential Revision: https://reviews.llvm.org/D40330 Change-Id: Icaeb75e014eff96a8f721377783f9a3e6c679275 llvm-svn: 323087
* [SelectionDAG] Fix codegen of vector stores with non byte-sized elements.Jonas Paulsson2018-01-203-6/+38
| | | | | | | | | | | | | | This was completely broken, but hopefully fixed by this patch. In cases where it is needed, a vector with non byte-sized elements is stored by extracting, zero-extending, shift:ing and or:ing the elements into an integer of the same width as the vector, which is then stored. Review: Eli Friedman, Ulrich Weigand https://reviews.llvm.org/D42100#inline-369520 https://bugs.llvm.org/show_bug.cgi?id=35520 llvm-svn: 323042
* CodeGen: handle llvm.used properly for COFFSaleem Abdulrasool2018-01-202-0/+35
| | | | | | | | | `llvm.used` contains a list of pointers to named values which the compiler, assembler, and linker are required to treat as if there is a reference that they cannot see. Ensure that the symbols are preserved by adding an explicit `-include` reference to the linker command. llvm-svn: 323017
* Add optional DICompileUnit to DIBuilder + make outliner debug info use itJessica Paquette2018-01-191-39/+68
| | | | | | | | | | | | | | | | | | | | | | Previously, the DIBuilder didn't expose functionality to set its compile unit in any other way than calling createCompileUnit. This meant that the outliner, which creates new functions, had to create a new compile unit for its debug info. This commit adds an optional parameter in the DIBuilder's constructor which lets you set its CU at construction. It also changes the MachineOutliner so that it keeps track of the DISubprograms for each outlined sequence. If debugging information is requested, then it uses one of the outlined sequence's DISubprograms to grab a CU. It then uses that CU to construct the DISubprogram for the new outlined function. The test has also been updated to reflect this change. See https://reviews.llvm.org/D42254 for more information. Also see the e-mail discussion on D42254 in llvm-commits for more context. llvm-svn: 322992
* [SelectionDAG] Teach computeKnownBits about ATOMIC_CMP_SWAP_WITH_SUCCESS ↵Ulrich Weigand2018-01-191-0/+1
| | | | | | | | | | | | boolean return value The second return value of ATOMIC_CMP_SWAP_WITH_SUCCESS is known to be a boolean, and should therefore be treated by computeKnownBits just like the second return values of SMULO / UMULO. Differential Revision: https://reviews.llvm.org/D42067 llvm-svn: 322985
* Remove alignment argument from memcpy/memmove/memset in favour of alignment ↵Daniel Neilson2018-01-191-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | attributes (Step 1) Summary: This is a resurrection of work first proposed and discussed in Aug 2015: http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html and initially landed (but then backed out) in Nov 2015: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument which is required to be a constant integer. It represents the alignment of the dest (and source), and so must be the minimum of the actual alignment of the two. This change is the first in a series that allows source and dest to each have their own alignments by using the alignment attribute on their arguments. In this change we: 1) Remove the alignment argument. 2) Add alignment attributes to the source & dest arguments. We, temporarily, require that the alignments for source & dest be equal. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false) will now read call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false) Downstream users may have to update their lit tests that check for @llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script may help with updating the majority of your tests, but it does not catch all possible patterns so some manual checking and updating will be required. s~declare void @llvm\.mem(set|cpy|move)\.p([^(]*)\((.*), i32, i1\)~declare void @llvm.mem\1.p\2(\3, i1)~g s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* align \6 \3, i8 \4, i8 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* align \6 \3, i8 \4, i16 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* align \6 \3, i8 \4, i32 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* align \6 \3, i8 \4, i64 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* align \6 \3, i8 \4, i128 \5, i1 \7)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* \4, i8\5* \6, i8 \7, i1 \8)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* \4, i8\5* \6, i16 \7, i1 \8)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* \4, i8\5* \6, i32 \7, i1 \8)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* \4, i8\5* \6, i64 \7, i1 \8)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* \4, i8\5* \6, i128 \7, i1 \8)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g The remaining changes in the series will: Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. Step 3) Update Clang to use the new IRBuilder API. Step 4) Update Polly to use the new IRBuilder API. Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get|set]Alignment() to use getDestAlignment() and getSourceAlignment() instead. Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods. Reviewers: pete, hfinkel, lhames, reames, bollu Reviewed By: reames Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits Differential Revision: https://reviews.llvm.org/D41675 llvm-svn: 322965
* [CodeGen] Unify printing format of debug-location in both MIR and -debugFrancis Visoiu Mistrih2018-01-192-9/+14
| | | | | | Use "debug-location" instead of "; dbg:" in MI::print. llvm-svn: 322936
* Split MachineLICM into EarlyMachineLICM and MachineLICM; NFCMatthias Braun2018-01-193-64/+79
| | | | | | | | | | | | | This avoids playing games with pseudo pass IDs and avoids using an unreliable MRI::isSSA() check to determine whether register allocation has happened. Note that this renames: - MachineLICMID -> EarlyMachineLICM - PostRAMachineLICMID -> MachineLICMID to be consistent with the EarlyTailDuplicate/TailDuplicate naming. llvm-svn: 322927
* Split TailDuplicatePass into pre- and post-RA variant; NFCMatthias Braun2018-01-193-27/+39
| | | | | | | | Split TailDuplicatePass into EarlyTailDuplicate and TailDuplicate. This avoids playing games with fake pass IDs and using MRI::isSSA() to determine pre-/post-RA state. llvm-svn: 322926
* Revert [CGP] Re-enable Select in complex addressing modeSerguei Katkov2018-01-191-1/+1
| | | | | | One of buildbots failed. Revert for now till fix the issue. llvm-svn: 322923
* AArch64: Fix emergency spillslot being out of reach for large callframesMatthias Braun2018-01-192-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Re-commit of r322200: The testcase shouldn't hit machineverifiers anymore with r322917 in place. Large callframes (calls with several hundreds or thousands or parameters) could lead to situations in which the emergency spillslot is out of range to be addressed relative to the stack pointer. This commit forces the use of a frame pointer in the presence of large callframes. This commit does several things: - Compute max callframe size at the end of instruction selection. - Add mirFileLoaded target callback. Use it to compute the max callframe size after loading a .mir file when the size wasn't specified in the file. - Let TargetFrameLowering::hasFP() return true if there exists a callframe > 255 bytes. - Always place the emergency spillslot close to FP if we have a frame pointer. - Note that `useFPForScavengingIndex()` would previously return false when a base pointer was available leading to the emergency spillslot getting allocated late (that's the whole effect of this callback). Which made no sense to me so I took this case out: Even though the emergency spillslot is technically not referenced by FP in this case we still want it allocated early. Differential Revision: https://reviews.llvm.org/D40876 llvm-svn: 322919
* [CodeGen][NFC] Rename IsVerbose to IsStandalone in Machine*::printFrancis Visoiu Mistrih2018-01-186-17/+18
| | | | | | | | Committed r322867 too soon. Differential Revision: https://reviews.llvm.org/D42239 llvm-svn: 322868
* [CodeGen] Print RegClasses on MI in verbose modeFrancis Visoiu Mistrih2018-01-186-21/+24
| | | | | | | | | | | | | r322086 removed the trailing information describing reg classes for each register. This patch adds printing reg classes next to every register when individual operands/instructions/basic blocks are printed. In the case of dumping MIR or printing a full function, by default don't print it. Differential Revision: https://reviews.llvm.org/D42239 llvm-svn: 322867
* [TargetLowering] add punctuation for readability; NFCSanjay Patel2018-01-181-1/+1
| | | | llvm-svn: 322855
* [CodeGen][NFC] Refactor MachineInstr::printFrancis Visoiu Mistrih2018-01-181-21/+45
| | | | | | | * Handle more cases where the MI is not attached yet * Add similar asserts like in MIRPrinter::print llvm-svn: 322848
OpenPOWER on IntegriCloud