path: root/llvm/test/CodeGen/X86/O3-pipeline.ll
* [X86] Add a Pass that builds a Condensed CFG for Load Value Injection (LVI) Gadgets
  Scott Constable, 2020-06-24 (1 file, -1/+4)

  Adds a new data structure, ImmutableGraph, and uses RDF to find LVI gadgets and add them to a MachineGadgetGraph. More specifically, a new X86 machine pass finds Load Value Injection (LVI) gadgets consisting of a load from memory (i.e., SOURCE), and any operation that may transmit the value loaded from memory over a covert channel, or use the value loaded from memory to determine a branch/call target (i.e., SINK).

  Also adds a new target feature to X86: +lvi-load-hardening. The feature can be added via the clang CLI using -mlvi-hardening.

  Differential Revision: https://reviews.llvm.org/D75936
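  For illustration (not part of the commit message): a minimal x86 sketch of the SOURCE/SINK pattern the pass looks for, with hypothetical registers:

        movq   (%rdi), %rax     # SOURCE: a load whose value may be attacker-injected under LVI
        jmpq   *%rax            # SINK: the loaded value determines the branch target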
* Revert "[X86] Add a Pass that builds a Condensed CFG for Load Value ↵Craig Topper2020-06-241-3/+1
| | | | | | | | | Injection (LVI) Gadgets" This reverts commit c74dd640fd740c6928f66a39c7c15a014af3f66f. Reverting to address coding standard issues raised in post-commit review.
* [X86] Add a Pass that builds a Condensed CFG for Load Value Injection (LVI) Gadgets
  Scott Constable, 2020-06-24 (1 file, -1/+3)

  Adds a new data structure, ImmutableGraph, and uses RDF to find LVI gadgets and add them to a MachineGadgetGraph. More specifically, a new X86 machine pass finds Load Value Injection (LVI) gadgets consisting of a load from memory (i.e., SOURCE), and any operation that may transmit the value loaded from memory over a covert channel, or use the value loaded from memory to determine a branch/call target (i.e., SINK).

  Also adds a new target feature to X86: +lvi-load-hardening. The feature can be added via the clang CLI using -mlvi-hardening.

  Differential Revision: https://reviews.llvm.org/D75936
* [X86] Add RET-hardening Support to mitigate Load Value Injection (LVI)
  Scott Constable, 2020-06-24 (1 file, -0/+1)

  Adding a pass that replaces every ret instruction with the sequence:

        pop <scratch-reg>
        lfence
        jmp *<scratch-reg>

  where <scratch-reg> is some available scratch register, according to the calling convention of the function being mitigated.

  Differential Revision: https://reviews.llvm.org/D75935
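  A concrete before/after sketch of that rewrite, assuming %r11 happens to be the available scratch register (the register choice is illustrative):

        # before
        retq

        # after LVI RET hardening
        popq   %r11             # pop the return address into a scratch register
        lfence                  # block speculation on the popped value
        jmpq   *%r11            # jump to the architecturally resolved target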
* [X86][NFC] Generalize the naming of "Retpoline Thunks" and related code to "Indirect Thunks"
  Scott Constable, 2020-06-24 (1 file, -1/+1)

  There are applications for indirect call/branch thunks other than retpoline for Spectre v2, e.g., https://software.intel.com/security-software-guidance/software-guidance/load-value-injection. Therefore it makes sense to refactor X86RetpolineThunks as a more general capability.

  Differential Revision: https://reviews.llvm.org/D76810
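  For context, a sketch of the retpoline thunk shape this code emits for indirect branches through %r11 (label names are illustrative):

  __llvm_retpoline_r11:
        callq  .Lset_up_target
  .Lcapture_spec:
        pause                   # mispredicted speculation is trapped in this loop
        lfence
        jmp    .Lcapture_spec
  .Lset_up_target:
        movq   %r11, (%rsp)     # replace the return address with the real branch target
        retq                    # architecturally returns to *%r11; the RSB prediction lands in the trap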
* [CodeGen] Move fentry-insert, xray-instrumentation and patchable-function before addPreEmitPass()
  Fangrui Song, 2020-01-24 (1 file, -3/+3)

  The intention is to move patchable-function before aarch64-branch-targets (configured in AArch64PassConfig::addPreEmitPass) so that we emit BTI before NOPs (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424). This also allows addPreEmitPass() passes to know the precise instruction sizes if they want.

  Tried x86-64 Debug/Release builds of ccls with -fxray-instrument -fxray-instruction-threshold=1. No output difference between this commit and the previous commit.

  (cherry picked from commit 9a24488cb67a90f889529987275c5e411ce01dda)
* [PGO][PGSO] Instrument the code gen / target passes.
  Hiroshi Yamauchi, 2019-12-09 (1 file, -1/+12)

  Summary: Split off of D67120.

  Add the profile guided size optimization instrumentation / queries in the code gen or target passes. This doesn't enable the size optimizations in those passes yet, as they are currently disabled in shouldOptimizeForSize (for non-IR pass queries).

  A second try after D71072 was reverted.

  Reviewers: davidxl
  Subscribers: hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D71149
* Revert "[PGO][PGSO] Instrument the code gen / target passes."Hiroshi Yamauchi2019-12-061-12/+1
| | | | | | This reverts commit 9a0b5e14075a1f42a72eedb66fd4fde7985d37ac. This seems to break buildbots.
* [PGO][PGSO] Instrument the code gen / target passes.
  Hiroshi Yamauchi, 2019-12-06 (1 file, -1/+12)

  Summary: Split off of D67120.

  Add the profile guided size optimization instrumentation / queries in the code gen or target passes. This doesn't enable the size optimizations in those passes yet, as they are currently disabled in shouldOptimizeForSize (for non-IR pass queries).

  Reviewers: davidxl
  Subscribers: hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D71072
* Reapply r374743 with a fix for the ocaml binding
  Joerg Sonnenberger, 2019-10-14 (1 file, -0/+1)

  Add a pass to lower is.constant and objectsize intrinsics.

  This pass lowers is.constant and objectsize intrinsics not simplified by earlier constant folding, i.e. if the object given is not constant or if not using the optimized pass chain. The result is recursively simplified and constant conditionals are pruned, so that dead blocks are removed even for -O0. This allows inline asm blocks with operand constraints to work all the time.

  The new pass replaces the existing lowering in the codegen-prepare pass and the fallbacks in SDAG/GlobalISel and FastISel. The latter now assert on the intrinsics.

  Differential Revision: https://reviews.llvm.org/D65280

  llvm-svn: 374784
* Revert "Add a pass to lower is.constant and objectsize intrinsics"Dmitri Gribenko2019-10-141-1/+0
| | | | | | | This reverts commit r374743. It broke the build with Ocaml enabled: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19218 llvm-svn: 374768
* Add a pass to lower is.constant and objectsize intrinsics
  Joerg Sonnenberger, 2019-10-13 (1 file, -0/+1)

  This pass lowers is.constant and objectsize intrinsics not simplified by earlier constant folding, i.e. if the object given is not constant or if not using the optimized pass chain. The result is recursively simplified and constant conditionals are pruned, so that dead blocks are removed even for -O0. This allows inline asm blocks with operand constraints to work all the time.

  The new pass replaces the existing lowering in the codegen-prepare pass and the fallbacks in SDAG/GlobalISel and FastISel. The latter now assert on the intrinsics.

  Differential Revision: https://reviews.llvm.org/D65280

  llvm-svn: 374743
* [Dominators][CodeGen] Don't mark MachineDominatorTree as preserved in MachineLICM
  Jakub Kuderski, 2019-10-01 (1 file, -0/+2)

  llvm-svn: 373378
* Revert "Reland "r364412 [ExpandMemCmp][MergeICmps] Move passes out of ↵Dmitri Gribenko2019-09-101-0/+4
| | | | | | | | | CodeGen into opt pipeline."" This reverts commit r371502, it broke tests (clang/test/CodeGenCXX/auto-var-init.cpp). llvm-svn: 371507
* Reland "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into ↵Clement Courbet2019-09-101-4/+0
| | | | | | | | opt pipeline." With a fix for sanitizer breakage (see explanation in D60318). llvm-svn: 371502
* [MachineCSE][MachinePRE] Avoid hoisting code from code regions into hot BBs.
  Kai Luo, 2019-07-19 (1 file, -1/+1)

  Summary: Current PRE hoists common computations into CMBB = DT->findNearestCommonDominator(MBB, MBB1). However, if CMBB is in a hot loop body, we might get performance degradation.

  Differential Revision: https://reviews.llvm.org/D64394

  llvm-svn: 366570
* Revert "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into ↵Clement Courbet2019-06-261-0/+4
| | | | | | | | | | | | | | opt pipeline." Breaks sanitizers: libFuzzer :: cxxstring.test libFuzzer :: memcmp.test libFuzzer :: recommended-dictionary.test libFuzzer :: strcmp.test libFuzzer :: value-profile-mem.test libFuzzer :: value-profile-strcmp.test llvm-svn: 364416
* [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline.
  Clement Courbet, 2019-06-26 (1 file, -4/+0)

  This allows later passes (in particular InstCombine) to optimize more cases. One that's important to us is `memcmp(p, q, constant) < 0` and `memcmp(p, q, constant) > 0`.

  llvm-svn: 364412
* Rename ExpandISelPseudo->FinalizeISel, delay register reservation
  Matt Arsenault, 2019-06-19 (1 file, -1/+1)

  This allows targets to make more decisions about reserved registers after isel. For example, it should now be certain whether there are calls or stack objects in the frame, which could have been introduced by legalization.

  Patch by Matthias Braun

  llvm-svn: 363757
* Delete x86_64 ShadowCallStack support
  Vlad Tsyrklevich, 2019-03-07 (1 file, -1/+0)

  Summary: ShadowCallStack on x86_64 suffered from the same racy security issues as Return Flow Guard and had performance overhead as high as 13% depending on the benchmark. x86_64 ShadowCallStack was always an experimental feature and never shipped with the runtime required to support it; as such, there are no expected downstream users.

  Reviewers: pcc
  Reviewed By: pcc
  Subscribers: mgorny, javed.absar, hiraditya, jdoerfert, cfe-commits, #sanitizers, llvm-commits
  Tags: #clang, #sanitizers, #llvm
  Differential Revision: https://reviews.llvm.org/D59034

  llvm-svn: 355624
* Revert "Revert r347596 "Support for inserting profile-directed cache ↵Mircea Trofin2018-11-301-0/+2
| | | | | | | | | | | | | | | | | | | prefetches"" Summary: This reverts commit d8517b96dfbd42e6a8db33c50d1fa1e58e63fbb9. Fix: correct the use of DenseMap. Reviewers: davidxl, hans, wmi Reviewed By: wmi Subscribers: mgorny, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D55088 llvm-svn: 347938
* Revert r347596 "Support for inserting profile-directed cache prefetches"
  Hans Wennborg, 2018-11-29 (1 file, -2/+0)

  It causes asserts building BoringSSL. See https://crbug.com/91009#c3 for a repro.

  This also reverts the follow-ups:
    Revert r347724 "Do not insert prefetches with unsupported memory operands."
    Revert r347606 "[X86] Add dependency from X86 to ProfileData after rL347596"
    Revert r347607 "Add new passes to X86 pipeline tests"

  llvm-svn: 347864
* Add new passes to X86 pipeline tests
  Mircea Trofin, 2018-11-26 (1 file, -0/+2)

  Summary: Fixes test failures introduced by rL347596.

  Reviewers: davidxl
  Reviewed By: davidxl
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D54916

  llvm-svn: 347607
* [X86] Disable Condbr_merge pass
  Rong Xu, 2018-11-16 (1 file, -1/+0)

  Disable the Condbr_merge pass for now due to PR39658. Will reenable the pass once the bug is fixed.

  llvm-svn: 347079
* [X86] Fix pipeline tests when enabling MIR verification, NFC
  Reid Kleckner, 2018-10-24 (1 file, -1/+4)

  llvm-svn: 345226
* Recommit r343993: [X86] condition branches folding for three-way conditional codes
  Rong Xu, 2018-10-09 (1 file, -0/+1)

  Fix the memory issue exposed by sanitizer.

  llvm-svn: 344085
* [X86] Revert r343993 condition branches folding for three-way conditional codes
  Rong Xu, 2018-10-08 (1 file, -1/+0)

  Some buildbots failed.

  llvm-svn: 343998
* [X86] condition branches folding for three-way conditional codes
  Rong Xu, 2018-10-08 (1 file, -0/+1)

  This patch implements a pass that optimizes condition branches on x86 by taking advantage of the three-way conditional code generated by compare instructions. Currently, it tries to hoist EQ and NE conditional branches to a dominant conditional branch condition where the same EQ/NE conditional code is computed. An example:

        bb_0:
          cmp %0, 19
          jg  bb_1
          jmp bb_2
        bb_1:
          cmp %0, 40
          jg  bb_3
          jmp bb_4
        bb_4:
          cmp %0, 20
          je  bb_5
          jmp bb_6

  Here we could combine the two compares in bb_0 and bb_4 and have the following code:

        bb_0:
          cmp %0, 20
          jg  bb_1
          jl  bb_2
          jmp bb_5
        bb_1:
          cmp %0, 40
          jg  bb_3
          jmp bb_6

  For the case of %0 == 20 (bb_5), we eliminate two jumps, and the control height for bb_6 is also reduced. bb_4 is gone after the optimization.

  This optimization is motivated by the branch pattern generated by the switch lowering: we always have a pivot-1 compare for the inner nodes and we do a pivot compare again at the leaf (like the above pattern). This pass is currently enabled on Intel's Sandybridge and later arches. Some reviewers pointed out that on some arches (like AMD Jaguar), this pass may increase branch density to the point where it hurts the performance of the branch predictor.

  Differential Revision: https://reviews.llvm.org/D46662

  llvm-svn: 343993
* Re-submitting changes in D51550 because it failed to patch.
  Christy Lee, 2018-09-24 (1 file, -0/+2)

  Reviewers: javed.absar, trentxintong, courbet
  Reviewed By: trentxintong
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D52433

  llvm-svn: 342919
* [x86/SLH] Add a real Clang flag and LLVM IR attribute for Speculative Load Hardening
  Chandler Carruth, 2018-09-04 (1 file, -0/+1)

  Wires up the existing pass to work with a proper IR attribute rather than just a hidden/internal flag. The internal flag continues to work for now, but I'll likely remove it soon.

  Most of the churn here is adding the IR attribute. I talked about this with Kristof Beyls and he seemed at least initially OK with this direction. The idea of using a full attribute here is that we *do* expect at least some forms of this for other architectures. There isn't anything *inherently* x86-specific about this technique, just that we only have an implementation for x86 at the moment. While we could potentially expose this as a Clang-level attribute as well, that seems like a good question to defer for the moment, as it isn't 100% clear whether that or some other programmer interface (or both?) would be best. We'll defer the programmer-interface side of this for now, but at least get to the point where the feature can be enabled without relying on implementation details.

  This also allows us to do something that was really hard before: we can enable *just* the indirect-call retpolines when using SLH. For x86, we don't have any other way to mitigate indirect calls. Other architectures may take a different approach, of course, and none of this is surfaced to user-level flags.

  Differential Revision: https://reviews.llvm.org/D51157

  llvm-svn: 341363
* [ShrinkWrap] Add optimization remarks to the shrink-wrapping pass
  Francis Visoiu Mistrih, 2018-06-05 (1 file, -1/+1)

  Start by emitting remarks for very basic unsupported cases such as irreducible CFGs and EHFunclets. The end goal is to be able to cover all the cases where we give up, with an explanation.

  llvm-svn: 333972
* Correct dwarf unwind information in function epilogue
  Petar Jovanovic, 2018-04-24 (1 file, -0/+1)

  This patch aims to provide correct dwarf unwind information in the function epilogue for X86. It consists of two parts.

  The first part inserts CFI instructions that set the appropriate cfa offset and cfa register in emitEpilogue() in X86FrameLowering. This part is X86-specific.

  The second part is platform independent and ensures that:
    * CFI instructions do not affect code generation (they are not counted as instructions when tail duplicating or tail merging);
    * Unwind information remains correct when a function is modified by different passes. This is done in a late pass by analyzing information about cfa offset and cfa register in BBs and inserting additional CFI directives where necessary.

  Added CFIInstrInserter pass, which:
    * analyzes each basic block to determine the cfa offset and register valid at its entry and exit;
    * verifies that the outgoing cfa offset and register of predecessor blocks match the incoming values of their successors;
    * inserts additional CFI directives at a basic block's beginning to correct the rule for calculating the CFA.

  Having CFI instructions in the function epilogue can cause an incorrect CFA calculation rule for some basic blocks. This can happen if, due to basic block reordering or the existence of multiple epilogue blocks, some of the blocks have wrong cfa offset and register values set by the epilogue block above them.

  CFIInstrInserter is currently run only on X86, but it can be used by any target that implements support for adding CFI instructions in the epilogue.

  Patch by Violeta Vukobrat.

  Differential Revision: https://reviews.llvm.org/D42848

  llvm-svn: 330706
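  A minimal sketch of the epilogue CFI this implies for a frame-pointer function (directive placement is illustrative):

        # function epilogue, with unwind info kept correct inside it
        popq   %rbp
        .cfi_def_cfa %rsp, 8    # after the pop, the CFA must be recomputed from %rsp
        retq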
* [x86] Fix PR37100 by teaching the EFLAGS copy lowering to rewrite uses across basic blocks in the limited cases where it is very straightforward to do so
  Chandler Carruth, 2018-04-18 (1 file, -0/+1)

  This will also be useful for other places where we do some limited EFLAGS propagation across CFG edges and need to handle copy rewrites afterward. I think this is rapidly approaching the maximum we can and should be doing here. Everything else begins to require either heroic analysis to prove how to do PHI insertion manually, or somehow managing arbitrary PHI-ing of EFLAGS with general PHI insertion. Neither of these seems at all promising, so if those cases come up, we'll almost certainly need to rewrite the parts of LLVM that produce those patterns.

  We do now require dominator trees in order to reliably diagnose patterns that would require PHI nodes. This is a bit unfortunate, but it seems better than the completely mysterious crash we would get otherwise.

  Differential Revision: https://reviews.llvm.org/D45673

  llvm-svn: 330264
* [x86] Introduce a pass to begin more systematically fixing PR36028 and similar issues
  Chandler Carruth, 2018-04-10 (1 file, -0/+1)

  The key idea is to lower COPY nodes populating EFLAGS by scanning the uses of EFLAGS and introducing dedicated code to preserve the necessary state in a GPR. In the vast majority of cases, these uses are cmovCC and jCC instructions. For such cases, we can very easily save and restore the necessary information by simply inserting a setCC into a GPR where the original flags are live, and then testing that GPR directly to feed the cmov or conditional branch.

  However, things are a bit more tricky if arithmetic is using the flags. This patch handles the vast majority of cases that seem to come up in practice: adc, adcx, adox, rcl, and rcr; all without taking advantage of partially preserved EFLAGS, as LLVM doesn't currently model that at all.

  There are a large number of operations that technically observe EFLAGS currently but shouldn't in this case -- they typically are using DF. Currently, they will not be handled by this approach. However, I have never seen this issue come up in practice. It is already pretty rare to have these patterns come up in practical code with LLVM. I had to resort to writing MIR tests to cover most of the logic in this pass already. I suspect even with its current amount of coverage of arithmetic users of EFLAGS, it will be a significant improvement over the current use of pushf/popf. It will also produce substantially faster code in most of the common patterns.

  This patch also removes all of the old lowering for EFLAGS copies, and the hack that forced us to use a frame pointer when EFLAGS copies were found anywhere in a function, so that the dynamic stack adjustment wasn't a problem. None of this is needed, as we now lower all of these copies directly in MI and without requiring stack adjustments.

  Lots of thanks to Reid, who came up with several aspects of this approach, and Craig, who helped me work out a couple of things tripping me up while working on this.

  Differential Revision: https://reviews.llvm.org/D45146

  llvm-svn: 329657
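  A minimal sketch of the setCC-into-GPR rewrite described above (registers, labels, and the intervening instruction are hypothetical):

        # instead of saving/restoring EFLAGS with pushfq/popfq:
        cmpq   %rsi, %rdi
        setne  %al              # capture the needed condition while the flags are live
        addq   $1, %rcx         # intervening arithmetic clobbers EFLAGS
        testb  %al, %al         # rematerialize the condition from the GPR
        jne    .Ltaken          # feed the original conditional branch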
* Remove MachineLoopInfo dependency from AsmPrinter.
  Michael Zolotukhin, 2018-04-09 (1 file, -2/+0)

  Summary: Currently MachineLoopInfo is used in only two places:
    1) for computing the IsBasicBlockInsideInnermostLoop field of MCCodePaddingContext, which is never used;
    2) in emitBasicBlockLoopComments, which is called only if `isVerbose()` is true.

  Despite that, we currently have a dependency on MachineLoopInfo, which makes the pass manager compute it and MachineDominatorTree. This patch removes use (1) and makes use (2) lazy, thus avoiding some redundant recomputations.

  Reviewers: opaparo, gadi.haber, rafael, craig.topper, zvi
  Subscribers: rengolin, javed.absar, hiraditya, llvm-commits
  Differential Revision: https://reviews.llvm.org/D44812

  llvm-svn: 329542
* Add the ShadowCallStack pass
  Vlad Tsyrklevich, 2018-04-04 (1 file, -0/+1)

  Summary: The ShadowCallStack pass instruments functions marked with the shadowcallstack attribute. The instrumented prolog saves the return address to [gs:offset], where offset is stored and updated in [gs:0]. The instrumented epilog loads/updates the return address from [gs:0] and checks that it matches the return address on the stack before returning.

  Reviewers: pcc, vitalybuka
  Reviewed By: pcc
  Subscribers: cryptoad, eugenis, craig.topper, mgorny, llvm-commits, kcc
  Differential Revision: https://reviews.llvm.org/D44802

  llvm-svn: 329139
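  A rough sketch of the scheme described in the summary; the register choices and exact bookkeeping are illustrative, not the pass's literal output:

  instrumented_function:
        # prologue: push the return address onto the gs-based shadow stack
        movq   %gs:0, %r10          # r10 = current shadow-stack offset, kept at gs:0
        addq   $8, %r10
        movq   %r10, %gs:0
        movq   (%rsp), %r11         # the return address pushed by our caller
        movq   %r11, %gs:(%r10)

        # ... function body ...

        # epilogue: check the on-stack return address against the shadow copy
        movq   %gs:0, %r10
        movq   %gs:(%r10), %r11
        subq   $8, %r10
        movq   %r10, %gs:0          # pop the shadow stack
        cmpq   %r11, (%rsp)
        jne    .Lscs_mismatch       # mismatched return address: fail fast
        retq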
* [X86] Reduce Store Forward Block issues in HW - Recommit after fixing Bug 36346
  Lama Saba, 2018-04-02 (1 file, -0/+1)

  If a load follows a store and reloads data that the store has written to memory, Intel microarchitectures can in many cases forward the data directly from the store to the load. This "store forwarding" saves cycles by enabling the load to directly obtain the data instead of accessing the data from cache or memory. A "store forward block" occurs in cases where a store cannot be forwarded to the load. The most typical case of a store forward block on Intel Core microarchitectures is when a small store cannot be forwarded to a large load. The estimated penalty for a store forward block is ~13 cycles.

  This pass tries to recognize and handle cases where a "store forward block" is created by the compiler when lowering memcpy calls to a sequence of a load and a store. The pass currently only handles cases where memcpy is lowered to XMM/YMM registers; it tries to break the memcpy into smaller copies. Breaking the memcpy should be possible since there is no atomicity guarantee for loads and stores to XMM/YMM.

  Differential revision: https://reviews.llvm.org/D41330

  Change-Id: Ib48836ccdf6005989f7d4466fa2035b7b04415d9

  llvm-svn: 328973
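  An illustrative instance of the blocked pattern and the kind of split the pass applies (offsets and registers are hypothetical; the exact split the pass chooses may differ):

        # memcpy(dst, src, 16) lowered to an XMM load/store pair:
        movb    $1, (%rdi)          # earlier small (1-byte) store into the source range
        movups  (%rdi), %xmm0       # 16-byte load partially overlaps the store: forwarding is blocked
        movups  %xmm0, (%rsi)

        # after the pass, the copy is broken up so the small store forwards cleanly:
        movb    (%rdi), %al         # 1-byte load matches the store's width and address
        movb    %al, (%rsi)
        movq    1(%rdi), %rcx       # bytes 1..8
        movq    %rcx, 1(%rsi)
        movl    9(%rdi), %edx       # bytes 9..12
        movl    %edx, 9(%rsi)
        movw    13(%rdi), %dx       # bytes 13..14
        movw    %dx, 13(%rsi)
        movb    15(%rdi), %dl       # byte 15
        movb    %dl, 15(%rsi)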
* [PostRAMachineSink] preserve CFG
  Jun Bum Lim, 2018-03-28 (1 file, -2/+0)

  Summary: Mark the CFG as preserved, since this pass does not make any change in the CFG.

  Reviewers: sebpop, mzolotukhin, mcrosier
  Reviewed By: mzolotukhin
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D44845

  llvm-svn: 328727
* Reapply "[test] Add tests for llc passes pipelines." with a fix for bots ↵Michael Zolotukhin2018-03-221-0/+170
| | | | | | with expensive checks on. llvm-svn: 328267
* Revert "[test] Add tests for llc passes pipelines."Jonas Devlieghere2018-03-221-167/+0
| | | | | | | This reverts r328159 because the two AArch64 tests fail on GreenDragon: http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-expensive/11030/ llvm-svn: 328188
* [test] Add tests for llc passes pipelines.
  Michael Zolotukhin, 2018-03-21 (1 file, -0/+167)

  This is basically an extension of the existing test test/CodeGen/X86/O0-pipeline.ll introduced in r302608.

  llvm-svn: 328159