summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/X86/X86MCInstLower.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Fix a bunch more layering of CodeGen headers that are in TargetDavid Blaikie2017-11-171-1/+1
| | | | | | | | All these headers already depend on CodeGen headers so moving them into CodeGen fixes the layering (since CodeGen depends on Target, not the other way around). llvm-svn: 318490
* [codeview] Implement FPO data assembler directivesReid Kleckner2017-10-111-29/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This adds a set of new directives that describe 32-bit x86 prologues. The directives are limited and do not expose the full complexity of codeview FPO data. They are merely a convenience for the compiler to generate more readable assembly so we don't need to generate tons of labels in CodeGen. If our prologue emission changes in the future, we can change the set of available directives to suit our needs. These are modelled after the .seh_ directives, which use a different format that interacts with exception handling. The directives are: .cv_fpo_proc _foo .cv_fpo_pushreg ebp/ebx/etc .cv_fpo_setframe ebp/esi/etc .cv_fpo_stackalloc 200 .cv_fpo_endprologue .cv_fpo_endproc .cv_fpo_data _foo I tried to follow the implementation of ARM EHABI CFI directives by sinking most directives out of MCStreamer and into X86TargetStreamer. This helps avoid polluting non-X86 code with WinCOFF specific logic. I used cdb to confirm that this can show locals in parent CSRs in a few cases, most importantly the one where we use ESI as a frame pointer, i.e. the one in http://crbug.com/756153#c28 Once we have cdb integration in debuginfo-tests, we can add integration tests there. Reviewers: majnemer, hans Subscribers: aemerson, mgorny, kristof.beyls, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D38776 llvm-svn: 315513
* Revert r314249 "Recommit r314151 "[X86] Make all the NOREX CodeGenOnly ↵Craig Topper2017-09-271-0/+4
| | | | | | | | instructions into postRA pseudos like the NOREX version of TEST.""" This caused PR34751 llvm-svn: 314339
* Recommit r314151 "[X86] Make all the NOREX CodeGenOnly instructions into ↵Craig Topper2017-09-261-4/+0
| | | | | | | | postRA pseudos like the NOREX version of TEST."" The late MOV8rr_NOREX that caused the crash has been removed. llvm-svn: 314249
* Revert "[X86] Make all the NOREX CodeGenOnly instructions into postRA ↵Benjamin Kramer2017-09-261-0/+4
| | | | | | | | pseudos like the NOREX version of TEST." Makes llc crash. This reverts commit r314151. llvm-svn: 314199
* [X86] Make all the NOREX CodeGenOnly instructions into postRA pseudos like ↵Craig Topper2017-09-251-4/+0
| | | | | | the NOREX version of TEST. llvm-svn: 314151
* [XRay][CodeGen] Use PIC-friendly code in XRay sleds and remove synthetic ↵Dean Michael Berris2017-09-041-35/+40
| | | | | | | | | | | | | | | | | references in .text Summary: This is a re-roll of D36615 which uses PLT relocations in the back-end to the call to __xray_CustomEvent() when building in -fPIC and -fxray-instrument mode. Reviewers: pcc, djasper, bkramer Subscribers: sdardis, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D37373 llvm-svn: 312466
* Revert r311525: "[XRay][CodeGen] Use PIC-friendly code in XRay sleds; remove ↵Daniel Jasper2017-08-311-38/+35
| | | | | | | | synthetic references in .text" Breaks builds internally. Will forward repo instructions to author. llvm-svn: 312243
* [XRay][CodeGen] Use PIC-friendly code in XRay sleds; remove synthetic ↵Dean Michael Berris2017-08-231-35/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | references in .text Summary: This change achieves two things: - Redefine the Custom Event handling instrumentation points emitted by the compiler to not require dynamic relocation of references to the __xray_CustomEvent trampoline. - Remove the synthetic reference we emit at the end of a function that we used to keep auxiliary sections alive in favour of SHF_LINK_ORDER associated with the section where the function is defined. To achieve the custom event handling change, we've had to introduce the concept of sled versioning -- this will need to be supported by the runtime to allow us to understand how to turn on/off the new version of the custom event handling sleds. That change has to land first before we change the way we write the sleds. To remove the synthetic reference, we rely on a relatively new linker feature that preserves the sections that are associated with each other. This allows us to limit the effects on the .text section of ELF binaries. Because we're still using absolute references that are resolved at runtime for the instrumentation map (and function index) maps, we mark these sections write-able. In the future we can re-define the entries in the map to use relative relocations instead that can be statically determined by the linker. That change will be a bit more invasive so we defer this for later. Depends on D36816. Reviewers: dblaikie, echristo, pcc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36615 llvm-svn: 311525
* [X86] Add comment string for broadcast loads from the constant pool.Craig Topper2017-07-041-37/+156
| | | | | | | | | | | | | | | | | Summary: When broadcasting from the constant pool its useful to print out the final vector similar to what we do for normal moves from the constant pool. I changed only a couple tests that were broadcast focused. One of them had been previously hand tweaked after running the script so that it could check the constant pool declaration. But I think this patch makes that unnecessary now since we can check the comment instead. Reviewers: spatel, RKSimon, zvi Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34923 llvm-svn: 307062
* fix trivial typos; NFCHiroshi Inoue2017-07-021-1/+1
| | | | | | suport -> support llvm-svn: 306968
* Move Object format code to lib/BinaryFormat.Zachary Turner2017-06-071-1/+1
| | | | | | | | | | | | This creates a new library called BinaryFormat that has all of the headers from llvm/Support containing structure and layout definitions for various types of binary formats like dwarf, coff, elf, etc as well as the code for identifying a file from its magic. Differential Revision: https://reviews.llvm.org/D33843 llvm-svn: 304864
* Sort the remaining #include lines in include/... and lib/....Chandler Carruth2017-06-061-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is *entirely* mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787
* [XRay] Custom event logging intrinsicDean Michael Berris2017-05-081-0/+80
| | | | | | | | | | | | | | | | | | | | This patch introduces an LLVM intrinsic and a target opcode for custom event logging in XRay. Initially, its use case will be to allow users of XRay to log some type of string ("poor man's printf"). The target opcode compiles to a noop sled large enough to enable calling through to a runtime-determined relative function call. At runtime, when X-Ray is enabled, the sled is replaced by compiler-rt with a trampoline to the logic for creating the custom log entries. Future patches will implement the compiler-rt parts and clang-side support for emitting the IR corresponding to this intrinsic. Reviewers: timshen, dberris Subscribers: igorb, pelikan, rSerge, timshen, echristo, dberris, llvm-commits Differential Revision: https://reviews.llvm.org/D27503 llvm-svn: 302405
* This patch closes PR#32216: Better testing of schedule model instruction ↵Andrew V. Tischenko2017-04-141-7/+11
| | | | | | | | latencies/throughputs. The details are here: https://reviews.llvm.org/D30941 llvm-svn: 300311
* [X86][XOP] Reduce the size of a multiclass by moving more stuff to ↵Craig Topper2017-02-181-4/+4
| | | | | | | | parameters instead of doing 128-bit and 256-bit simultaneously. This requires some instructions to be renamed to move the Y earlier in the instruction name. The new names are more consistent with other instructions. llvm-svn: 295579
* [X86] Re-enable conditional tail calls and fix PR31257.Hans Wennborg2017-02-161-1/+8
| | | | | | | | | | | This reverts r294348, which removed support for conditional tail calls due to the PR above. It fixes the PR by marking live registers as implicitly used and defined by the now predicated tailcall. This is similar to how IfConversion predicates instructions. Differential Revision: https://reviews.llvm.org/D29856 llvm-svn: 295262
* [X86] Disable conditional tail calls (PR31257)Hans Wennborg2017-02-071-8/+1
| | | | | | | | | They are currently modelled incorrectly (as calls, which clobber registers, confusing e.g. Machine Copy Propagation). Reverting until we figure out the proper solution. llvm-svn: 294348
* [ImplicitNullCheck] Extend Implicit Null Check scope by using storesSanjoy Das2017-02-071-19/+23
| | | | | | | | | | | | | | | | | | | | | Summary: This change allows usage of store instruction for implicit null check. Memory Aliasing Analisys is not used and change conservatively supposes that any store and load may access the same memory. As a result re-ordering of store-store, store-load and load-store is prohibited. Patch by Serguei Katkov! Reviewers: reames, sanjoy Reviewed By: sanjoy Subscribers: atrick, llvm-commits Differential Revision: https://reviews.llvm.org/D29400 llvm-svn: 294338
* X86: Produce @ABS8 symbol modifiers for absolute symbols in range [0,128).Peter Collingbourne2017-02-021-0/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D28689 llvm-svn: 293844
* [X86] Implement -mfentryNirav Dave2017-01-311-0/+16
| | | | | | | | | | | | Summary: Insert calls to __fentry__ at function entry. Reviewers: hfinkel, craig.topper Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D28000 llvm-svn: 293648
* Cleanup dump() functions.Matthias Braun2017-01-281-1/+1
| | | | | | | | | | | | | | | | | | We had various variants of defining dump() functions in LLVM. Normalize them (this should just consistently implement the things discussed in http://lists.llvm.org/pipermail/cfe-dev/2014-January/034323.html For reference: - Public headers should just declare the dump() method but not use LLVM_DUMP_METHOD or #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP) - The definition of a dump method should look like this: #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP) LLVM_DUMP_METHOD void MyClass::dump() { // print stuff to dbgs()... } #endif llvm-svn: 293359
* [XRay] Merge instrumentation point table emission code into AsmPrinter.Dean Michael Berris2017-01-031-50/+0
| | | | | | | | | | | | | | | | | | Summary: No need to have this per-architecture. While there, unify 32-bit ARM's behaviour with what changed elsewhere and start function names lowercase as per the coding standards. Individual entry emission code goes to the entry's own class. Fully tested on amd64, cross-builds on both ARMs and PowerPC. Reviewers: dberris Subscribers: aemerson, llvm-commits Differential Revision: https://reviews.llvm.org/D28209 llvm-svn: 290858
* This is a large patch for X86 AVX-512 of an optimization for reducing code ↵Gadi Haber2016-12-281-0/+8
| | | | | | | | | | | | size by encoding EVEX AVX-512 instructions using the shorter VEX encoding when possible. There are cases of AVX-512 instructions that have two possible encodings. This is the case with instructions that use vector registers with low indexes of 0 - 15 and do not use the zmm registers or the mask k registers. The EVEX encoding prefix requires 4 bytes whereas the VEX prefix can take only up to 3 bytes. Consequently, using the VEX encoding for these instructions results in a code size reduction of ~2 bytes even though it is compiled with the AVX-512 features enabled. Reviewers: Craig Topper, Zvi Rackoover, Elena Demikhovsky Differential Revision: https://reviews.llvm.org/D27901 llvm-svn: 290663
* [X86] Invert an 'if' and early out to fix a weird indentation. NFCICraig Topper2016-11-251-1/+2
| | | | llvm-svn: 287909
* [X86] Size a SmallVector to the worst case mask size for a 512-bit shuffle. NFCICraig Topper2016-11-251-1/+1
| | | | llvm-svn: 287908
* [xray] Add XRay support for Mach-O in CodeGenKuba Mracek2016-11-231-26/+34
| | | | | | | | Currently, XRay only supports emitting the XRay table (xray_instr_map) on ELF binaries. Let's add Mach-O support. Differential Revision: https://reviews.llvm.org/D26983 llvm-svn: 287734
* [X86][AVX512] Add mask/maskz writemask support to constant pool shuffle ↵Simon Pilgrim2016-10-181-24/+36
| | | | | | decode commentx llvm-svn: 284488
* [AVX-512] Add support for decoding shuffle mask from constant pool for ↵Craig Topper2016-10-181-24/+53
| | | | | | masked VPERMILPS/PD. llvm-svn: 284450
* [X86] Fix shuffle decoding assertions to print the right number of required ↵Craig Topper2016-10-171-8/+8
| | | | | | operands. Update the checks themselves to be >= to the same number instead of > one less than the required number. llvm-svn: 284365
* Win64: Don't emit unwind info for "leaf" functions (PR30337)Hans Wennborg2016-09-221-0/+8
| | | | | | | | | | | | According to MSDN (see the PR), functions which don't touch any callee-saved registers (including %rsp) don't need any unwind info. This patch makes LLVM not emit unwind info for such functions, to save binary size. Differential Revision: https://reviews.llvm.org/D24748 llvm-svn: 282185
* [XRay] ARM 32-bit no-Thumb support in LLVMDean Michael Berris2016-09-191-10/+0
| | | | | | | | | | | | This is a port of XRay to ARM 32-bit, without Thumb support yet. The XRay instrumentation support is moving up to AsmPrinter. This is one of 3 commits to different repositories of XRay ARM port. The other 2 are: https://reviews.llvm.org/D23932 (Clang test) https://reviews.llvm.org/D23933 (compiler-rt) Differential Revision: https://reviews.llvm.org/D23931 llvm-svn: 281878
* Remove unused function getMang().Eric Christopher2016-09-161-3/+0
| | | | llvm-svn: 281707
* X86: Fold tail calls into conditional branches also for 64-bit (PR26302)Hans Wennborg2016-09-091-0/+2
| | | | | | | | | This extends the optimization in r280832 to also work for 64-bit. The only quirk is that we can't do this for 64-bit Windows (yet). Differential Revision: https://reviews.llvm.org/D24423 llvm-svn: 281113
* Win64: Don't use REX prefix for direct tail callsHans Wennborg2016-09-081-1/+0
| | | | | | | | | | The REX prefix should be used on indirect jmps, but not direct ones. For direct jumps, the unwinder looks at the offset to determine if it's inside the current function. Differential Revision: https://reviews.llvm.org/D24359 llvm-svn: 281003
* Revert "[XRay] ARM 32-bit no-Thumb support in LLVM"Renato Golin2016-09-081-0/+10
| | | | | | | | | | And associated commits, as they broke the Thumb bots. This reverts commit r280935. This reverts commit r280891. This reverts commit r280888. llvm-svn: 280967
* [XRay] ARM 32-bit no-Thumb support in LLVMDean Michael Berris2016-09-081-10/+0
| | | | | | | | | | | | This is a port of XRay to ARM 32-bit, without Thumb support yet. The XRay instrumentation support is moving up to AsmPrinter. This is one of 3 commits to different repositories of XRay ARM port. The other 2 are: 1. https://reviews.llvm.org/D23932 (Clang test) 2. https://reviews.llvm.org/D23933 (compiler-rt) Differential Revision: https://reviews.llvm.org/D23931 llvm-svn: 280888
* X86: Fold tail calls into conditional branches where possible (PR26302)Hans Wennborg2016-09-071-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When branching to a block that immediately tail calls, it is possible to fold the call directly into the branch if the call is direct and there is no stack adjustment, saving one byte. Example: define void @f(i32 %x, i32 %y) { entry: %p = icmp eq i32 %x, %y br i1 %p, label %bb1, label %bb2 bb1: tail call void @foo() ret void bb2: tail call void @bar() ret void } before: f: movl 4(%esp), %eax cmpl 8(%esp), %eax jne .LBB0_2 jmp foo .LBB0_2: jmp bar after: f: movl 4(%esp), %eax cmpl 8(%esp), %eax jne bar .LBB0_1: jmp foo I don't expect any significant size savings from this (on a Clang bootstrap I saw 288 bytes), but it does make the code a little tighter. This patch only does 32-bit, but 64-bit would work similarly. Differential Revision: https://reviews.llvm.org/D24108 llvm-svn: 280832
* [XRay] Detect and emit sleds for sibling/tail callsDean Michael Berris2016-09-011-0/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change promotes the 'isTailCall(...)' member function to TargetInstrInfo as a query interface for determining on a per-target basis whether a given MachineInstr is a tail call instruction. We build upon this in the XRay instrumentation pass to emit special sleds for tail call optimisations, where we emit the correct kind of sled. The tail call sleds look like a mix between the function entry and function exit sleds. Form-wise, the sled comes before the "jmp" instruction that implements the tail call similar to how we do it for the function entry sled. Functionally, because we know this is a tail call, it behaves much like an exit sled -- i.e. at runtime we may use the exit trampolines instead of a different kind of trampoline. A follow-up change to recognise these sleds will be done in compiler-rt, so that we can start intercepting these initially as exits, but also have the option to have different log entries to more accurately reflect that this is actually a tail call. Reviewers: echristo, rSerge, majnemer Subscribers: mehdi_amini, dberris, llvm-commits Differential Revision: https://reviews.llvm.org/D23986 llvm-svn: 280334
* [stackmaps] More extraction of common code [NFCI]Philip Reames2016-08-231-3/+2
| | | | | | General cleanup before starting to work on the part I want to actually change. llvm-svn: 279586
* [XRay] Synthesize a reference to the xray_instr_mapDean Michael Berris2016-08-191-0/+12
| | | | | | | | | | | | | | | | | | | | | Without the synthesized reference to a symbol in the xray_instr_map, linker section garbage collection will helpfully remove the whole xray_instr_map section from the final executable (or archive). This will cause the runtime to not be able to identify the sleds and hot-patch the calls/jumps into the runtime trampolines. This change adds a reference from the text section at the end of the function to keep around the associated xray_instr_map section as well. We also make sure that we catch this reference in the test. Reviewers: chandlerc, echristo, majnemer, mehdi_amini Subscribers: mehdi_amini, llvm-commits, dberris Differential Revision: https://reviews.llvm.org/D23398 llvm-svn: 279204
* [XRay] Test for xray_instr_map in object file. (NFC)Dean Michael Berris2016-08-091-1/+0
| | | | | | | | This makes a trivial change in the emission of the per-function XRay tables, and makes sure that the xray_instr_map section does show up in the object file. llvm-svn: 278113
* [XRay] Align entry and return sleds to 2 byte boundariesDean Michael Berris2016-08-041-2/+4
| | | | | | | | | | | | | | | | | | | | This should ensure that we can atomically write two bytes (on top of the retq and the one past it) and have those two bytes not straddle cache lines. We also move the label past the alignment instruction so that we can refer to the actual first instruction, as opposed to potential padding before the aligned instruction. Update the tests to allow us to reflect the new order of assembly. Reviewers: rSerge, echristo, majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23101 llvm-svn: 277701
* [XRay] Make the xray_instr_map section specification more correctDean Michael Berris2016-08-031-4/+11
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: We also add a test to show what currently happens when we create a section per function and emit an xray_instr_map. This illustrates the relationship (or lack thereof) between the per-function section and the xray_instr_map section. We also change the code generation slightly so that we don't always create group sections, but rather only do so if a function where the table is associated with is in a group. Also in this change: - Remove the "merge" flag on the xray_instr_map section. - Test that we're generating the right table for comdat and non-comdat functions. Reviewers: echristo, majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23104 llvm-svn: 277580
* XRay: Add entry and exit sledsDean Michael Berris2016-07-141-0/+105
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: In this patch we implement the following parts of XRay: - Supporting a function attribute named 'function-instrument' which currently only supports 'xray-always'. We should be able to use this attribute for other instrumentation approaches. - Supporting a function attribute named 'xray-instruction-threshold' used to determine whether a function is instrumented with a minimum number of instructions (IR instruction counts). - X86-specific nop sleds as described in the white paper. - A machine function pass that adds the different instrumentation marker instructions at a very late stage. - A way of identifying which return opcode is considered "normal" for each architecture. There are some caveats here: 1) We don't handle PATCHABLE_RET in platforms other than x86_64 yet -- this means if IR used PATCHABLE_RET directly instead of a normal ret, instruction lowering for that platform might do the wrong thing. We think this should be handled at instruction selection time to by default be unpacked for platforms where XRay is not availble yet. 2) The generated section for X86 is different from what is described from the white paper for the sole reason that LLVM allows us to do this neatly. We're taking the opportunity to deviate from the white paper from this perspective to allow us to get richer information from the runtime library. Reviewers: sanjoy, eugenis, kcc, pcc, echristo, rnk Subscribers: niravd, majnemer, atrick, rnk, emaste, bmakam, mcrosier, mehdi_amini, llvm-commits Differential Revision: http://reviews.llvm.org/D19904 llvm-svn: 275367
* [X86][AVX512] Add support for VPERMILPD/VPERMILPS variable shuffle mask commentsSimon Pilgrim2016-07-131-9/+26
| | | | llvm-svn: 275272
* X86: Avoid implicit iterator conversions, NFCDuncan P. N. Exon Smith2016-07-121-2/+4
| | | | | | | | Avoid implicit conversions from MachineInstrBundleIterator to MachineInstr*, mainly by preferring MachineInstr& over MachineInstr* and using range-based for loops. llvm-svn: 275149
* Drop support for creating $stubs.Rafael Espindola2016-06-291-27/+0
| | | | | | They are created by ld64 since OS X 10.5. llvm-svn: 274130
* doesSetDirectiveSuppressesReloc -> doesSetDirectiveSuppressReloc, theJoerg Sonnenberger2016-06-181-1/+1
| | | | | | former is grammatically incorrect. llvm-svn: 273100
* Avoid copies of std::strings and APInt/APFloats where we only read from itBenjamin Kramer2016-06-081-1/+1
| | | | | | | | As suggested by clang-tidy's performance-unnecessary-copy-initialization. This can easily hit lifetime issues, so I audited every change and ran the tests under asan, which came back clean. llvm-svn: 272126
OpenPOWER on IntegriCloud