path: root/llvm/lib/CodeGen/TargetPassConfig.cpp
Commit message | Author | Age | Files | Lines
...
* [MergeICmps] Re-commit rL324317 "Enable the MergeICmps Pass by default." | Clement Courbet | 2018-02-07 | 1 | -5/+4
  With fixes from rL324341.
  Original commit message:
  [MergeICmps] Enable the MergeICmps Pass by default.
  Summary: Now that PR33325 is fixed, this should always improve the generated code.
  Reviewers: spatel
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D42793
  llvm-svn: 324465
* Revert "[MergeICmps] Enable the MergeICmps Pass by default." | Clement Courbet | 2018-02-06 | 1 | -4/+5
  Breaks clang-ppc64be-linux-multistage buildbot.
  This reverts commit 515bab711f308c2e8299c49dd8c84ea6a2e0b60e.
  llvm-svn: 324319
* [MergeICmps] Enable the MergeICmps Pass by default. | Clement Courbet | 2018-02-06 | 1 | -5/+4
  Summary: Now that PR33325 is fixed, this should always improve the generated code.
  Reviewers: spatel
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D42793
  llvm-svn: 324317
* [MachineCopyPropagation] Extend pass to do COPY source forwarding | Geoff Berry | 2018-02-01 | 1 | -0/+4
  Summary: This change extends MachineCopyPropagation to do COPY source forwarding and adds an additional run of the pass to the default pass pipeline just after register allocation.
  This version of this patch uses the newly added MachineOperand::isRenamable bit to avoid forwarding registers in such a way as to violate constraints that aren't captured in the Machine IR (e.g. ABI or ISA constraints).
  This change is a continuation of the work started in D30751.
  Reviewers: qcolombet, javed.absar, MatzeB, jonpa, tstellar
  Subscribers: tpr, mgorny, mcrosier, nhaehnle, nemanjai, jyknight, hfinkel, arsenm, inouehrs, eraman, sdardis, guyblank, fedor.sergeev, aheejin, dschuff, jfb, myatsina, llvm-commits
  Differential Revision: https://reviews.llvm.org/D41835
  llvm-svn: 323991
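As a rough illustration of what COPY source forwarding means, here is a toy C++ sketch over a straight-line list of instructions. It is not LLVM code: `Instr`, `forwardCopySources`, and the integer register numbering are hypothetical, and the sketch ignores the renamable-operand, ABI, and ISA constraints that the commit message describes.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy instruction: one defined register and a list of used registers.
struct Instr {
  std::string opcode;
  int def;
  std::vector<int> uses;
};

// Rewrites each use of a COPY destination to use the COPY source instead,
// as long as neither register has been redefined in between.
void forwardCopySources(std::vector<Instr> &block) {
  std::map<int, int> copyOf; // COPY destination -> COPY source, still valid
  for (Instr &I : block) {
    // Forward: replace uses of a copied register with the original source.
    for (int &u : I.uses) {
      auto it = copyOf.find(u);
      if (it != copyOf.end())
        u = it->second;
    }
    // A new definition invalidates every mapping that mentions that register.
    for (auto it = copyOf.begin(); it != copyOf.end();) {
      if (it->first == I.def || it->second == I.def)
        it = copyOf.erase(it);
      else
        ++it;
    }
    // Record the new copy relation.
    if (I.opcode == "COPY" && I.uses.size() == 1 && I.uses[0] != I.def)
      copyOf[I.def] = I.uses[0];
  }
}
```

With `r1 = COPY r0` followed by `r2 = ADD r1, r1`, the sketch rewrites the second instruction to read `r0`. A later use of `r1` after `r0` has been redefined is left alone.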
* [GlobalISel] Don't fall back to FastISel. | Amara Emerson | 2018-01-24 | 1 | -0/+2
  Apparently checking the pass structure isn't enough to ensure that we don't fall back to FastISel, as it's set up as part of the SelectionDAGISel.
  llvm-svn: 323369
* Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre. | Chandler Carruth | 2018-01-22 | 1 | -0/+3
  Summary: First, we need to explain the core of the vulnerability. Note that this is a very incomplete description, please see the Project Zero blog post for details: https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html

  The basis for branch target injection is to direct speculative execution of the processor to some "gadget" of executable code by poisoning the prediction of indirect branches with the address of that gadget. The gadget in turn contains an operation that provides a side channel for reading data. Most commonly, this will look like a load of secret data followed by a branch on the loaded value and then a load of some predictable cache line. The attacker then uses timing of the processor's cache to determine which direction the branch took *in the speculative execution*, and in turn what one bit of the loaded value was. Due to the nature of these timing side channels and the branch predictor on Intel processors, this allows an attacker to leak data only accessible to a privileged domain (like the kernel) back into an unprivileged domain.

  The goal is simple: avoid generating code which contains an indirect branch that could have its prediction poisoned by an attacker. In many cases, the compiler can simply use directed conditional branches and a small search tree. LLVM already has support for lowering switches in this way and the first step of this patch is to disable jump-table lowering of switches and introduce a pass to rewrite explicit indirectbr sequences into a switch over integers.

  However, there is no fully general alternative to indirect calls. We introduce a new construct we call a "retpoline" to implement indirect calls in a non-speculatable way. It can be thought of loosely as a trampoline for indirect calls which uses the RET instruction on x86. Further, we arrange for a specific call->ret sequence which ensures the processor predicts the return to go to a controlled, known location. The retpoline then "smashes" the return address pushed onto the stack by the call with the desired target of the original indirect call. The result is a predicted return to the next instruction after a call (which can be used to trap speculative execution within an infinite loop) and an actual indirect branch to an arbitrary address.

  On 64-bit x86 ABIs, this is especially easily done in the compiler by using a guaranteed scratch register to pass the target into this device. For 32-bit ABIs there isn't a guaranteed scratch register and so several different retpoline variants are introduced to use a scratch register if one is available in the calling convention and to otherwise use direct stack push/pop sequences to pass the target address.

  This "retpoline" mitigation is fully described in the following blog post: https://support.google.com/faqs/answer/7625886

  We also support a target feature that disables emission of the retpoline thunk by the compiler to allow for custom thunks if users want them. These are particularly useful in environments like kernels that routinely do hot-patching on boot and want to hot-patch their thunk to different code sequences. They can write this custom thunk and use `-mretpoline-external-thunk` *in addition* to `-mretpoline`. In this case, on x86-64 the thunk names must be:
  ```
    __llvm_external_retpoline_r11
  ```
  or on 32-bit:
  ```
    __llvm_external_retpoline_eax
    __llvm_external_retpoline_ecx
    __llvm_external_retpoline_edx
    __llvm_external_retpoline_push
  ```
  And the target of the retpoline is passed in the named register, or in the case of the `push` suffix on the top of the stack via a `pushl` instruction.

  There is one other important source of indirect branches in x86 ELF binaries: the PLT. These patches also include support for LLD to generate PLT entries that perform a retpoline-style indirection.

  The only other indirect branches remaining that we are aware of are from precompiled runtimes (such as crt0.o and similar). The ones we have found are not really attackable, and so we have not focused on them here, but eventually these runtimes should also be replicated for retpoline-ed configurations for completeness.

  For kernels or other freestanding or fully static executables, the compiler switch `-mretpoline` is sufficient to fully mitigate this particular attack. For dynamic executables, you must compile *all* libraries with `-mretpoline` and additionally link the dynamic executable and all shared libraries with LLD and pass `-z retpolineplt` (or use similar functionality from some other linker). We strongly recommend also using `-z now` as non-lazy binding allows the retpoline-mitigated PLT to be substantially smaller.

  When manually applying similar transformations to `-mretpoline` to the Linux kernel we observed very small performance hits to applications running typical workloads, and relatively minor hits (approximately 2%) even for extremely syscall-heavy applications. This is largely due to the small number of indirect branches that occur in performance sensitive paths of the kernel.

  When using these patches on statically linked applications, especially C++ applications, you should expect to see a much more dramatic performance hit. For microbenchmarks that are switch, indirect-, or virtual-call heavy we have seen overheads ranging from 10% to 50%.

  However, real-world workloads exhibit substantially lower performance impact. Notably, techniques such as PGO and ThinLTO dramatically reduce the impact of hot indirect calls (by speculatively promoting them to direct calls) and allow optimized search trees to be used to lower switches. If you need to deploy these techniques in C++ applications, we *strongly* recommend that you ensure all hot call targets are statically linked (avoiding PLT indirection) and use both PGO and ThinLTO. Well tuned servers using all of these techniques saw 5% - 10% overhead from the use of retpoline.

  We will add detailed documentation covering these components in subsequent patches, but wanted to make the core functionality available as soon as possible. Happy for more code review, but we'd really like to get these patches landed and backported ASAP for obvious reasons. We're planning to backport this to both 6.0 and 5.0 release streams and get a 5.0 release with just this cherry picked ASAP for distros and vendors.

  This patch is the work of a number of people over the past month: Eric, Reid, Rui, and myself. I'm mailing it out as a single commit due to the time sensitive nature of landing this and the need to backport it. Huge thanks to everyone who helped out here, and everyone at Intel who helped out in discussions about how to craft this. Also, credit goes to Paul Turner (at Google, but not an LLVM contributor) for much of the underlying retpoline design.

  Reviewers: echristo, rnk, ruiu, craig.topper, DavidKreitzer
  Subscribers: sanjoy, emaste, mcrosier, mgorny, mehdi_amini, hiraditya, llvm-commits
  Differential Revision: https://reviews.llvm.org/D41723
  llvm-svn: 323155
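The retpoline message above contrasts jump-table lowering of switches with "directed conditional branches and a small search tree". The following source-level C++ sketch shows the two dispatch shapes side by side. It is only an analogy: the handler names are hypothetical, and an optimizing compiler is free to turn either function into something else, so this does not show what LLVM actually emits.

```cpp
#include <cassert>

// Hypothetical handlers for four switch cases.
inline int handle0() { return 10; }
inline int handle1() { return 11; }
inline int handle2() { return 12; }
inline int handle3() { return 13; }

// Jump-table shape: one indirect call through a table indexed by the case
// value. This kind of indirect branch is what the mitigation tries to avoid.
inline int dispatchTable(unsigned c) {
  static int (*const table[4])() = {handle0, handle1, handle2, handle3};
  assert(c < 4 && "case value out of range");
  return table[c]();
}

// Search-tree shape: only conditional branches and direct calls,
// arranged as a two-level binary search over the case values.
inline int dispatchTree(unsigned c) {
  assert(c < 4 && "case value out of range");
  if (c < 2)
    return c == 0 ? handle0() : handle1();
  return c == 2 ? handle2() : handle3();
}
```

Both functions return the same result for every case value in range. The difference is only in how control reaches the handler.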
* Split MachineLICM into EarlyMachineLICM and MachineLICM; NFC | Matthias Braun | 2018-01-19 | 1 | -10/+4
  This avoids playing games with pseudo pass IDs and avoids using an unreliable MRI::isSSA() check to determine whether register allocation has happened.
  Note that this renames:
  - MachineLICMID -> EarlyMachineLICM
  - PostRAMachineLICMID -> MachineLICMID
  to be consistent with the EarlyTailDuplicate/TailDuplicate naming.
  llvm-svn: 322927
* Split TailDuplicatePass into pre- and post-RA variant; NFC | Matthias Braun | 2018-01-19 | 1 | -3/+1
  Split TailDuplicatePass into EarlyTailDuplicate and TailDuplicate. This avoids playing games with fake pass IDs and using MRI::isSSA() to determine pre-/post-RA state.
  llvm-svn: 322926
* Fix the failure caused by r322773 | Volkan Keles | 2018-01-18 | 1 | -8/+3
  Do not run GlobalISel if `-fast-isel=0 -global-isel=false`.
  llvm-svn: 322800
* Add a TargetOption to enable/disable GlobalISel | Volkan Keles | 2018-01-17 | 1 | -15/+14
  Summary: This patch adds a new target option in order to control GlobalISel. This will allow the users to enable/disable GlobalISel prior to the backend by calling `TargetMachine::setGlobalISel(bool Enable)`.
  No test case as there is already a test to check GlobalISel command line options. See: CodeGen/AArch64/GlobalISel/gisel-commandline-option.ll.
  Reviewers: qcolombet, aemerson, ab, dsanders
  Reviewed By: qcolombet
  Subscribers: rovka, javed.absar, kristof.beyls, llvm-commits
  Differential Revision: https://reviews.llvm.org/D42137
  llvm-svn: 322773
* [AArch64][GlobalISel] Enable GlobalISel at -O0 by default | Amara Emerson | 2018-01-02 | 1 | -2/+10
  Tests updated to explicitly use fast-isel at -O0 instead of implicitly.
  This change also allows an explicit -fast-isel option to override an implicitly enabled global-isel. Otherwise -fast-isel would have no effect at -O0.
  Differential Revision: https://reviews.llvm.org/D41362
  llvm-svn: 321655
* Mark all library options as hidden. | Zachary Turner | 2017-12-01 | 1 | -17/+15
  These command line options are not intended for public use, and often don't even make sense in the context of a particular tool anyway. About 90% of them are already hidden, but when people add new options they forget to hide them, so if you were to make a brand new tool today, link against one of LLVM's libraries, and run tool -help you would get a bunch of junk that doesn't make sense for the tool you're writing.
  This patch hides these options. The real solution is to not have libraries defining command line options, but that's a much larger effort and not something I'm prepared to take on.
  Differential Revision: https://reviews.llvm.org/D40674
  llvm-svn: 319505
* Rename CountingFunctionInserter and use for both mcount and cygprofile calls, before and after inlining | Hans Wennborg | 2017-11-14 | 1 | -2/+2
  Clang implements the -finstrument-functions flag inherited from GCC, which inserts calls to __cyg_profile_func_{enter,exit} on function entry and exit.
  This is useful for getting a trace of how the functions in a program are executed. Normally, the calls remain even if a function is inlined into another function, but it is useful to be able to turn this off for users who are interested in a lower-level trace, i.e. one that reflects what functions are called post-inlining. (We use this to generate link order files for Chromium.)
  LLVM already has a pass for inserting similar instrumentation calls to mcount(), which it does after inlining. This patch renames and extends that pass to handle calls both to mcount and the cygprofile functions, before and/or after inlining as controlled by function attributes.
  Differential Revision: https://reviews.llvm.org/D39287
  llvm-svn: 318195
* re-land "[ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass." | Clement Courbet | 2017-11-03 | 1 | -2/+8
  Fix undefined references: ExpandMemCmp belongs to CodeGen/, not Scalar/.
  llvm-svn: 317318
* Revert "[ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass." | Clement Courbet | 2017-11-02 | 1 | -8/+2
  undefined reference to `llvm::TargetPassConfig::ID' on clang-ppc64le-linux-multistage
  This reverts commit eea333c33fa73ad225ef28607795984829f65688.
  llvm-svn: 317213
* [ExpandMemCmp] Split ExpandMemCmp from CodeGen into its own pass. | Clement Courbet | 2017-11-02 | 1 | -2/+8
  Summary: This is mostly a noop (most of the test diffs are renamed blocks). There are a few temporary register renames (eax<->ecx) and a few blocks are shuffled around.
  See the discussion in PR33325 for more details.
  Reviewers: spatel
  Subscribers: mgorny
  Differential Revision: https://reviews.llvm.org/D39456
  llvm-svn: 317211
* Revert "TargetMachine: Merge TargetMachine and LLVMTargetMachine" | Matthias Braun | 2017-10-12 | 1 | -2/+2
  Reverting to investigate layering effects of MCJIT not linking libCodeGen but using TargetMachine::getNameWithPrefix() breaking the lldb bots.
  This reverts commit r315633.
  llvm-svn: 315637
* TargetMachine: Merge TargetMachine and LLVMTargetMachine | Matthias Braun | 2017-10-12 | 1 | -2/+2
  Merge LLVMTargetMachine into TargetMachine.
  - There is no in-tree target anymore that just implements TargetMachine but not LLVMTargetMachine.
  - It should still be possible to stub out all the various functions in case a target does not want to use lib/CodeGen
  - This simplifies the code and avoids methods ending up in the wrong interface.
  Differential Revision: https://reviews.llvm.org/D38489
  llvm-svn: 315633
* [MachineOutliner] Disable outlining from LinkOnceODRs by default | Jessica Paquette | 2017-10-07 | 1 | -1/+6
  Say you have two identical linkonceodr functions, one in M1 and one in M2. Say that the outliner outlines A,B,C from one function, and D,E,F from another function (where letters are instructions). Now those functions are not identical, and cannot be deduped. Locally to M1 and M2, these outlining choices would be good; to the whole program, however, this might not be true!
  To mitigate this, this commit makes it so that the outliner sees linkonceodr functions as unsafe to outline from. It also adds a flag, -enable-linkonceodr-outlining, which allows the user to specify that they want to outline from such functions when they know what they're doing.
  Changing this handles most code size regressions in the test suite caused by competing with linker dedupe. It also doesn't have a huge impact on the code size improvements from the outliner. There are 6 tests that regress > 5% from outlining WITH linkonceodrs to outlining WITHOUT linkonceodrs. Overall, most tests either improve or are not impacted.
  Not outlined vs outlined without linkonceodrs: https://hastebin.com/raw/qeguxavuda
  Not outlined vs outlined with linkonceodrs: https://hastebin.com/raw/edepoqoqic
  Outlined with linkonceodrs vs outlined without linkonceodrs: https://hastebin.com/raw/awiqifiheb
  Numbers generated using compare.py with -m size.__text. Tests run for AArch64 with -Oz -mllvm -enable-machine-outliner -mno-red-zone.
  llvm-svn: 315136
* Revert "Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"" | Geoff Berry | 2017-10-03 | 1 | -9/+0
  This reverts commit r314729.
  Another bug has been encountered in an out-of-tree target reported by Quentin.
  llvm-svn: 314814
* Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding" | Geoff Berry | 2017-10-02 | 1 | -0/+9
  Issues addressed since original review:
  - Avoid bug in regalloc greedy/machine verifier when forwarding to use in an instruction that re-defines the same virtual register.
  - Fixed bug when forwarding to use in EarlyClobber instruction slot.
  - Fixed incorrect forwarding to register definitions that showed up in explicit_uses() iterator (e.g. in INLINEASM).
  - Moved removal of dead instructions found by LiveIntervals::shrinkToUses() outside of loop iterating over instructions to avoid instructions being deleted while pointed to by iterator.
  - Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907.
  - The pass no longer forwards COPYs to physical register uses, since doing so can break code that implicitly relies on the physical register number of the use.
  - The pass no longer forwards COPYs to undef uses, since doing so can break the machine verifier by creating LiveRanges that don't end on a use (since the undef operand is not considered a use).
  [MachineCopyPropagation] Extend pass to do COPY source forwarding
  This change extends MachineCopyPropagation to do COPY source forwarding.
  This change also extends the MachineCopyPropagation pass to be able to be run during register allocation, after physical registers have been assigned, but before the virtual registers have been re-written, which allows it to remove virtual register COPY LiveIntervals that become dead through the forwarding of all of their uses.
  llvm-svn: 314729
* Update branch coalescing to be a PowerPC specific pass | Lei Huang | 2017-09-12 | 1 | -3/+0
  Implementing this pass as a PowerPC specific pass. Branch coalescing utilizes the analyzeBranch method which currently does not include any implicit operands. This is not an issue on PPC but must be handled on other targets.
  Pass is currently off by default. Enabled via -enable-ppc-branch-coalesce.
  Differential Revision: https://reviews.llvm.org/D32776
  llvm-svn: 313061
* Revert "Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"" | Sam McCall | 2017-09-04 | 1 | -9/+0
  This crashes on boringSSL on PPC (will send reduced testcase)
  This reverts commit r312328.
  llvm-svn: 312490
* Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding" | Geoff Berry | 2017-09-01 | 1 | -0/+9
  Issues addressed since original review:
  - Moved removal of dead instructions found by LiveIntervals::shrinkToUses() outside of loop iterating over instructions to avoid instructions being deleted while pointed to by iterator.
  - Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907.
  - The pass no longer forwards COPYs to physical register uses, since doing so can break code that implicitly relies on the physical register number of the use.
  - The pass no longer forwards COPYs to undef uses, since doing so can break the machine verifier by creating LiveRanges that don't end on a use (since the undef operand is not considered a use).
  [MachineCopyPropagation] Extend pass to do COPY source forwarding
  This change extends MachineCopyPropagation to do COPY source forwarding.
  This change also extends the MachineCopyPropagation pass to be able to be run during register allocation, after physical registers have been assigned, but before the virtual registers have been re-written, which allows it to remove virtual register COPY LiveIntervals that become dead through the forwarding of all of their uses.
  llvm-svn: 312328
* Reland rL312315: [MergeICmps] MergeICmps is a new optimization pass that turns chains of integer | Clement Courbet | 2017-09-01 | 1 | -0/+8
  Add missing header.
  This reverts commit 86dd6335cf7607af22f383a9a8e072ba929848cf.
  llvm-svn: 312322
* Revert "[MergeICmps] MergeICmps is a new optimization pass that turns chains of integer" | Clement Courbet | 2017-09-01 | 1 | -8/+0
  Break build
  This reverts commit d07ab866f7f88f81e49046d691a80dcd32d7198b.
  llvm-svn: 312317
* [MergeICmps] MergeICmps is a new optimization pass that turns chains of integer | Clement Courbet | 2017-09-01 | 1 | -0/+8
  comparisons into memcmp.
  Thanks to recent improvements in the LLVM codegen, the memcmp is typically inlined as a chain of efficient hardware comparisons. This typically benefits C++ member or nonmember operator==().
  For now this is disabled by default until:
  - https://bugs.llvm.org/show_bug.cgi?id=33329 is complete
  - Benchmarks show that this is always useful.
  Differential Revision: https://reviews.llvm.org/D33987
  llvm-svn: 312315
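To make the MergeICmps idea concrete, the sketch below shows the source-level pattern the commit message refers to: an `operator==`-style chain of integer comparisons over adjacent fields, next to the single `memcmp` it is equivalent to. The `Point` type and both function names are hypothetical, and the equivalence shown here relies on the assumption, checked by the `static_assert`, that the struct has no padding.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical example type: two adjacent integer fields.
struct Point {
  std::uint32_t x;
  std::uint32_t y;
};
static_assert(sizeof(Point) == 2 * sizeof(std::uint32_t),
              "sketch assumes no padding between or after the fields");

// What the programmer writes: a chain of integer comparisons.
inline bool equalFieldwise(const Point &a, const Point &b) {
  return a.x == b.x && a.y == b.y;
}

// What the chain is equivalent to when the fields are contiguous:
// one memcmp over the bytes of both objects.
inline bool equalMemcmp(const Point &a, const Point &b) {
  return std::memcmp(&a, &b, sizeof(Point)) == 0;
}
```

The two functions agree for all inputs under the stated no-padding assumption. The pass itself works on LLVM IR, not on C++ source, so this is only a picture of the transformation's intent.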
* Temporarily revert "Update branch coalescing to be a PowerPC specific pass" | Eric Christopher | 2017-08-31 | 1 | -0/+3
  From comments and code review it wasn't intended to be enabled by default yet.
  This reverts commit r311588.
  llvm-svn: 312214
* Revert r312154 "Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"" | Hans Wennborg | 2017-08-30 | 1 | -9/+0
  It caused PR34387: Assertion failed: (RegNo < NumRegs && "Attempting to access record for invalid register number!")
  > Issues identified by buildbots addressed since original review:
  > - Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907.
  > - The pass no longer forwards COPYs to physical register uses, since
  >   doing so can break code that implicitly relies on the physical
  >   register number of the use.
  > - The pass no longer forwards COPYs to undef uses, since doing so
  >   can break the machine verifier by creating LiveRanges that don't
  >   end on a use (since the undef operand is not considered a use).
  >
  > [MachineCopyPropagation] Extend pass to do COPY source forwarding
  >
  > This change extends MachineCopyPropagation to do COPY source forwarding.
  >
  > This change also extends the MachineCopyPropagation pass to be able to
  > be run during register allocation, after physical registers have been
  > assigned, but before the virtual registers have been re-written, which
  > allows it to remove virtual register COPY LiveIntervals that become dead
  > through the forwarding of all of their uses.
  llvm-svn: 312178
* Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding" | Geoff Berry | 2017-08-30 | 1 | -0/+9
  Issues identified by buildbots addressed since original review:
  - Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907.
  - The pass no longer forwards COPYs to physical register uses, since doing so can break code that implicitly relies on the physical register number of the use.
  - The pass no longer forwards COPYs to undef uses, since doing so can break the machine verifier by creating LiveRanges that don't end on a use (since the undef operand is not considered a use).
  [MachineCopyPropagation] Extend pass to do COPY source forwarding
  This change extends MachineCopyPropagation to do COPY source forwarding.
  This change also extends the MachineCopyPropagation pass to be able to be run during register allocation, after physical registers have been assigned, but before the virtual registers have been re-written, which allows it to remove virtual register COPY LiveIntervals that become dead through the forwarding of all of their uses.
  llvm-svn: 312154
* Update branch coalescing to be a PowerPC specific pass | Lei Huang | 2017-08-23 | 1 | -3/+0
  Implementing this pass as a PowerPC specific pass. Branch coalescing utilizes the analyzeBranch method which currently does not include any implicit operands. This is not an issue on PPC but must be handled on other targets.
  Differential Revision: https://reviews.llvm.org/D32776
  llvm-svn: 311588
* Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding" round 2 | Geoff Berry | 2017-08-18 | 1 | -9/+0
  This reverts commit r311135.
  sanitizer-x86_64-linux-android buildbot is timing out with just this patch applied.
  llvm-svn: 311142
* Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding" | Geoff Berry | 2017-08-17 | 1 | -0/+9
  Two issues identified by buildbots were addressed:
  - The pass no longer forwards COPYs to physical register uses, since doing so can break code that implicitly relies on the physical register number of the use.
  - The pass no longer forwards COPYs to undef uses, since doing so can break the machine verifier by creating LiveRanges that don't end on a use (since the undef operand is not considered a use).
  [MachineCopyPropagation] Extend pass to do COPY source forwarding
  This change extends MachineCopyPropagation to do COPY source forwarding.
  This change also extends the MachineCopyPropagation pass to be able to be run during register allocation, after physical registers have been assigned, but before the virtual registers have been re-written, which allows it to remove virtual register COPY LiveIntervals that become dead through the forwarding of all of their uses.
  Reviewers: qcolombet, javed.absar, MatzeB, jonpa
  Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny
  Differential Revision: https://reviews.llvm.org/D30751
  llvm-svn: 311135
* Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding" | Geoff Berry | 2017-08-17 | 1 | -9/+0
  This reverts commit r311038.
  Several buildbots are breaking, and at least one appears to be due to the forwarding of physical regs enabled by this change. Reverting while I investigate further.
  llvm-svn: 311062
* [MachineCopyPropagation] Extend pass to do COPY source forwarding | Geoff Berry | 2017-08-16 | 1 | -0/+9
  This change extends MachineCopyPropagation to do COPY source forwarding.
  This change also extends the MachineCopyPropagation pass to be able to be run during register allocation, after physical registers have been assigned, but before the virtual registers have been re-written, which allows it to remove virtual register COPY LiveIntervals that become dead through the forwarding of all of their uses.
  Reviewers: qcolombet, javed.absar, MatzeB, jonpa
  Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny
  Differential Revision: https://reviews.llvm.org/D30751
  llvm-svn: 311038
* IPRA: Allow target to enable IPRA by default | Matt Arsenault | 2017-08-14 | 1 | -0/+10
  llvm-svn: 310876
* IPRA: Run RegUsageInfoPropagate much later | Matt Arsenault | 2017-08-14 | 1 | -3/+3
  This was running immediately after isel, before isel pseudos were even expanded, which is really unreasonable. Move this to before pre-regalloc passes in case some other pre-regalloc pass wants to use the updated regmask info.
  Fixes one of the reasons IPRA doesn't do anything on AMDGPU currently. Tests will be included with a future patch after a few more are fixed.
  llvm-svn: 310875
* [TargetPassConfig] Feature generic options to setup start/stop-after/before | Quentin Colombet | 2017-07-31 | 1 | -0/+85
  This patch refactors the code used in llc such that all the users of the addPassesToEmitFile API have access to a homogeneous way of handling start/stop-after/before options right out of the box.
  In particular, just invoking addPassesToEmitFile will set the proper pipeline without additional effort (modulo parsing a .mir file if the start-before/after options are used).
  NFC.
  Differential Revision: https://reviews.llvm.org/D30913
  llvm-svn: 309599
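The start/stop-after/before options select a contiguous window of a pass pipeline. As a loose model of that selection, here is a toy C++ sketch over a list of pass names. None of it is LLVM's API: `Window`, `selectPasses`, and the pass names used with it are hypothetical, and the sketch assumes each named pass occurs at most once in the pipeline.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical option set; an empty string means "option not given".
struct Window {
  std::string startBefore, startAfter, stopBefore, stopAfter;
};

// Returns the passes that would run, in order, for the given window.
std::vector<std::string> selectPasses(const std::vector<std::string> &pipeline,
                                      const Window &w) {
  assert((w.startBefore.empty() || w.startAfter.empty()) &&
         "at most one start option");
  assert((w.stopBefore.empty() || w.stopAfter.empty()) &&
         "at most one stop option");
  std::vector<std::string> out;
  // With no start option, the window opens at the first pass.
  bool started = w.startBefore.empty() && w.startAfter.empty();
  for (const std::string &p : pipeline) {
    if (!started && !w.startBefore.empty() && p == w.startBefore)
      started = true; // start-before: the named pass is included
    if (started && !w.stopBefore.empty() && p == w.stopBefore)
      break; // stop-before: the named pass is excluded
    if (started)
      out.push_back(p);
    if (started && !w.stopAfter.empty() && p == w.stopAfter)
      break; // stop-after: the named pass is included
    if (!started && !w.startAfter.empty() && p == w.startAfter)
      started = true; // start-after: the named pass is excluded
  }
  return out;
}
```

For a pipeline `a, b, c, d`, asking to start after `a` and stop after `c` yields `b, c` in this model, and giving no options yields the whole pipeline.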
* Revert "r306529 - [X86] Correct dwarf unwind information in function epilogue" | Daniel Jasper | 2017-06-29 | 1 | -7/+0
  I am 99% sure that this breaks the PPC ASAN build bot: http://lab.llvm.org:8011/builders/sanitizer-ppc64be-linux/builds/3112/steps/64-bit%20check-asan/logs/stdio
  If it doesn't go back to green, we can recommit (and fix the original commit message at the same time :) ).
  llvm-svn: 306676
* [X86] Correct dwarf unwind information in function epilogue | Petar Jovanovic | 2017-06-28 | 1 | -0/+7
  CFI instructions that set appropriate cfa offset and cfa register are now inserted in emitEpilogue() in X86FrameLowering.
  Majority of the changes in this patch:
  1. Ensure that CFI instructions do not affect code generation.
  2. Enable maintaining correct information about cfa offset and cfa register in a function when basic blocks are reordered, merged, split, duplicated.
  These changes are target independent and described below.
  Changed CFI instructions so that they:
  1. are duplicable
  2. are not counted as instructions when tail duplicating or tail merging
  3. can be compared as equal
  Add information to each MachineBasicBlock about cfa offset and cfa register that are valid at its entry and exit (incoming and outgoing CFI info). Add support for updating this information when basic blocks are merged, split, duplicated, created. Add a verification pass (CFIInfoVerifier) that checks that outgoing cfa offset and register of predecessor blocks match incoming values of their successors.
  Incoming and outgoing CFI information is used by a late pass (CFIInstrInserter) that corrects CFA calculation rule for a basic block if needed. That means that additional CFI instructions get inserted at basic block beginning to correct the rule for calculating CFA. Having CFI instructions in function epilogue can cause incorrect CFA calculation rule for some basic blocks. This can happen if, due to basic block reordering, or the existence of multiple epilogue blocks, some of the blocks have wrong cfa offset and register values set by the epilogue block above them.
  Patch by Violeta Vukobrat.
  Differential Revision: https://reviews.llvm.org/D18046
  llvm-svn: 306529
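The verification step described above (outgoing CFA offset and register of each predecessor must match the incoming values of its successors) can be pictured with a small stand-alone C++ sketch. This is not the CFIInfoVerifier pass: `CfaInfo`, `Block`, and `cfaInfoConsistent` are hypothetical names, and blocks refer to their successors by index purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record of the CFA rule: offset plus register number.
struct CfaInfo {
  int offset;
  unsigned reg;
};

inline bool operator==(const CfaInfo &a, const CfaInfo &b) {
  return a.offset == b.offset && a.reg == b.reg;
}

// Hypothetical basic block: CFA info valid at entry and exit, plus
// the indices of its successor blocks.
struct Block {
  CfaInfo incoming;
  CfaInfo outgoing;
  std::vector<std::size_t> successors;
};

// Returns true if, for every edge pred -> succ, the predecessor's outgoing
// CFA info equals the successor's incoming CFA info.
bool cfaInfoConsistent(const std::vector<Block> &blocks) {
  for (const Block &pred : blocks) {
    for (std::size_t s : pred.successors) {
      assert(s < blocks.size() && "successor index out of range");
      if (!(pred.outgoing == blocks[s].incoming))
        return false;
    }
  }
  return true;
}
```

A mismatch on any single edge makes the whole function fail the check, which in this model corresponds to a block whose CFA rule would need correcting at its beginning.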
* [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). | Eugene Zelenko | 2017-06-26 | 1 | -14/+22
  llvm-svn: 306341
* CodeGen/LLVMTargetMachine: Refactor ISel pass construction; NFCI | Matthias Braun | 2017-06-06 | 1 | -0/+71
  - Move ISel (and pre-isel) pass construction into TargetPassConfig
  - Extract AsmPrinter construction into a helper function
  Putting the ISel code into TargetPassConfig seems a lot more natural and both changes together make it easier to build custom pipelines involving .mir in an upcoming commit. This moves MachineModuleInfo to an earlier place in the pass pipeline which shouldn't have any effect.
  llvm-svn: 304754
* TargetMachine: Indicate whether machine verifier passes. | Matthias Braun | 2017-05-31 | 1 | -1/+6
  This adds a callback to the LLVMTargetMachine that lets targets indicate that they do not pass the machine verifier checks in all cases yet.
  This is intended to be a temporary measure while the targets are fixed, allowing us to enable the machine verifier by default with EXPENSIVE_CHECKS enabled!
  Differential Revision: https://reviews.llvm.org/D33696
  llvm-svn: 304320
* TargetPassConfig: Keep a reference to an LLVMTargetMachine; NFCMatthias Braun2017-05-301-5/+5
| | | | | | | | | | | TargetPassConfig is not useful for targets that do not use the CodeGen library, so we may just as well store a pointer to an LLVMTargetMachine instead of just to a TargetMachine. While at it, also change the constructor to take a reference instead of a pointer as the TM must not be nullptr. llvm-svn: 304247
* [LegacyPassManager] Remove TargetMachine constructorsFrancis Visoiu Mistrih2017-05-181-8/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This provides a new way to access the TargetMachine through TargetPassConfig, as a dependency. The patterns replaced here are: * Passes handling a null TargetMachine call `getAnalysisIfAvailable<TargetPassConfig>`. * Passes not handling a null TargetMachine `addRequired<TargetPassConfig>` and call `getAnalysis<TargetPassConfig>`. * MachineFunctionPasses now use MF.getTarget(). * Remove all the TargetMachine constructors. * Remove INITIALIZE_TM_PASS. This fixes a crash when running `llc -start-before prologepilog`. PEI needs StackProtector, which gets constructed without a TargetMachine by the pass manager. The StackProtector pass doesn't handle the case where there is no TargetMachine, so it segfaults. Related to PR30324. Differential Revision: https://reviews.llvm.org/D33222 llvm-svn: 303360
* Only enable LiveRangeShrink for x86.Dehao Chen2017-05-171-3/+0
| | | | | | | | | | | | | | Summary: Moving LiveRangeShrink to x86 as this pass is mostly useful for architectures with great register pressure. Reviewers: MatzeB, qcolombet Reviewed By: qcolombet Subscribers: jholewinski, jyknight, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33294 llvm-svn: 303292
* Make sure -optimize-regalloc=false is used correctly by user.Jonas Paulsson2017-05-171-10/+14
| | | | | | | | | | | Don't allow -optimize-regalloc=false with -regalloc given for anything other than 'fast'. The other register allocators depend on the supporting passes added by addOptimizedRegAlloc(). Reviewers: Quentin Colombet, Matthias Braun https://reviews.llvm.org/D33181 llvm-svn: 303238
* [X86] Relocate code of replacement of subtarget unsupported masked memory ↵Ayman Musa2017-05-151-0/+5
| | | | | | | | | | | | | | intrinsics to run also on -O0 option. Currently, when masked load, store, gather or scatter intrinsics are used, we check in the CodeGenPrepare pass whether the subtarget supports these intrinsics; if not, we replace them with scalar code. This is a functional transformation, not an optimization (not optional). The CodeGenPrepare pass does not run when the optimization level is set to CodeGenOpt::None (-O0). Functional transformations should run at all optimization levels, so here I created a new pass which runs on all optimization levels and does no more than this transformation. Differential Revision: https://reviews.llvm.org/D32487 llvm-svn: 303050
* Add LiveRangeShrink pass to shrink live range within BB.Dehao Chen2017-05-121-0/+3
| | | | | | | | | | | | | | Summary: LiveRangeShrink pass moves instruction right after the definition with the same BB if the instruction and its operands all have more than one use. This pass is inexpensive and guarantees optimal live-range within BB. Reviewers: davidxl, wmi, hfinkel, MatzeB, andreadb Reviewed By: MatzeB, andreadb Subscribers: hiraditya, jyknight, sanjoy, skatkov, gberry, jholewinski, qcolombet, javed.absar, krytarowski, atrick, spatel, RKSimon, andreadb, MatzeB, mehdi_amini, mgorny, efriedma, davide, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D32563 llvm-svn: 302938
* Add a late IR expansion pass for the experimental reduction intrinsics.Amara Emerson2017-05-101-0/+3
| | | | | | | | | This pass uses a new target hook to decide whether or not to expand a particular intrinsic to the shufflevector sequence. Differential Revision: https://reviews.llvm.org/D32245 llvm-svn: 302631