summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/X86/X86MCInstLower.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* Win64: Don't use REX prefix for direct tail callsHans Wennborg2016-09-081-1/+0
| | | | | | | | | | The REX prefix should be used on indirect jmps, but not direct ones. For direct jumps, the unwinder looks at the offset to determine if it's inside the current function. Differential Revision: https://reviews.llvm.org/D24359 llvm-svn: 281003
* Revert "[XRay] ARM 32-bit no-Thumb support in LLVM"Renato Golin2016-09-081-0/+10
| | | | | | | | | | And associated commits, as they broke the Thumb bots. This reverts commit r280935. This reverts commit r280891. This reverts commit r280888. llvm-svn: 280967
* [XRay] ARM 32-bit no-Thumb support in LLVMDean Michael Berris2016-09-081-10/+0
| | | | | | | | | | | | This is a port of XRay to ARM 32-bit, without Thumb support yet. The XRay instrumentation support is moving up to AsmPrinter. This is one of 3 commits to different repositories of XRay ARM port. The other 2 are: 1. https://reviews.llvm.org/D23932 (Clang test) 2. https://reviews.llvm.org/D23933 (compiler-rt) Differential Revision: https://reviews.llvm.org/D23931 llvm-svn: 280888
* X86: Fold tail calls into conditional branches where possible (PR26302)Hans Wennborg2016-09-071-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When branching to a block that immediately tail calls, it is possible to fold the call directly into the branch if the call is direct and there is no stack adjustment, saving one byte. Example: define void @f(i32 %x, i32 %y) { entry: %p = icmp eq i32 %x, %y br i1 %p, label %bb1, label %bb2 bb1: tail call void @foo() ret void bb2: tail call void @bar() ret void } before: f: movl 4(%esp), %eax cmpl 8(%esp), %eax jne .LBB0_2 jmp foo .LBB0_2: jmp bar after: f: movl 4(%esp), %eax cmpl 8(%esp), %eax jne bar .LBB0_1: jmp foo I don't expect any significant size savings from this (on a Clang bootstrap I saw 288 bytes), but it does make the code a little tighter. This patch only does 32-bit, but 64-bit would work similarly. Differential Revision: https://reviews.llvm.org/D24108 llvm-svn: 280832
* [XRay] Detect and emit sleds for sibling/tail callsDean Michael Berris2016-09-011-0/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change promotes the 'isTailCall(...)' member function to TargetInstrInfo as a query interface for determining on a per-target basis whether a given MachineInstr is a tail call instruction. We build upon this in the XRay instrumentation pass to emit special sleds for tail call optimisations, where we emit the correct kind of sled. The tail call sleds look like a mix between the function entry and function exit sleds. Form-wise, the sled comes before the "jmp" instruction that implements the tail call similar to how we do it for the function entry sled. Functionally, because we know this is a tail call, it behaves much like an exit sled -- i.e. at runtime we may use the exit trampolines instead of a different kind of trampoline. A follow-up change to recognise these sleds will be done in compiler-rt, so that we can start intercepting these initially as exits, but also have the option to have different log entries to more accurately reflect that this is actually a tail call. Reviewers: echristo, rSerge, majnemer Subscribers: mehdi_amini, dberris, llvm-commits Differential Revision: https://reviews.llvm.org/D23986 llvm-svn: 280334
* [stackmaps] More extraction of common code [NFCI]Philip Reames2016-08-231-3/+2
| | | | | | General cleanup before starting to work on the part I want to actually change. llvm-svn: 279586
* [XRay] Synthesize a reference to the xray_instr_mapDean Michael Berris2016-08-191-0/+12
| | | | | | | | | | | | | | | | | | | | | Without the synthesized reference to a symbol in the xray_instr_map, linker section garbage collection will helpfully remove the whole xray_instr_map section from the final executable (or archive). This will cause the runtime to not be able to identify the sleds and hot-patch the calls/jumps into the runtime trampolines. This change adds a reference from the text section at the end of the function to keep around the associated xray_instr_map section as well. We also make sure that we catch this reference in the test. Reviewers: chandlerc, echristo, majnemer, mehdi_amini Subscribers: mehdi_amini, llvm-commits, dberris Differential Revision: https://reviews.llvm.org/D23398 llvm-svn: 279204
* [XRay] Test for xray_instr_map in object file. (NFC)Dean Michael Berris2016-08-091-1/+0
| | | | | | | | This makes a trivial change in the emission of the per-function XRay tables, and makes sure that the xray_instr_map section does show up in the object file. llvm-svn: 278113
* [XRay] Align entry and return sleds to 2 byte boundariesDean Michael Berris2016-08-041-2/+4
| | | | | | | | | | | | | | | | | | | | This should ensure that we can atomically write two bytes (on top of the retq and the one past it) and have those two bytes not straddle cache lines. We also move the label past the alignment instruction so that we can refer to the actual first instruction, as opposed to potential padding before the aligned instruction. Update the tests to allow us to reflect the new order of assembly. Reviewers: rSerge, echristo, majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23101 llvm-svn: 277701
* [XRay] Make the xray_instr_map section specification more correctDean Michael Berris2016-08-031-4/+11
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: We also add a test to show what currently happens when we create a section per function and emit an xray_instr_map. This illustrates the relationship (or lack thereof) between the per-function section and the xray_instr_map section. We also change the code generation slightly so that we don't always create group sections, but rather only do so if a function where the table is associated with is in a group. Also in this change: - Remove the "merge" flag on the xray_instr_map section. - Test that we're generating the right table for comdat and non-comdat functions. Reviewers: echristo, majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23104 llvm-svn: 277580
* XRay: Add entry and exit sledsDean Michael Berris2016-07-141-0/+105
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: In this patch we implement the following parts of XRay: - Supporting a function attribute named 'function-instrument' which currently only supports 'xray-always'. We should be able to use this attribute for other instrumentation approaches. - Supporting a function attribute named 'xray-instruction-threshold' used to determine whether a function is instrumented with a minimum number of instructions (IR instruction counts). - X86-specific nop sleds as described in the white paper. - A machine function pass that adds the different instrumentation marker instructions at a very late stage. - A way of identifying which return opcode is considered "normal" for each architecture. There are some caveats here: 1) We don't handle PATCHABLE_RET in platforms other than x86_64 yet -- this means if IR used PATCHABLE_RET directly instead of a normal ret, instruction lowering for that platform might do the wrong thing. We think this should be handled at instruction selection time to by default be unpacked for platforms where XRay is not availble yet. 2) The generated section for X86 is different from what is described from the white paper for the sole reason that LLVM allows us to do this neatly. We're taking the opportunity to deviate from the white paper from this perspective to allow us to get richer information from the runtime library. Reviewers: sanjoy, eugenis, kcc, pcc, echristo, rnk Subscribers: niravd, majnemer, atrick, rnk, emaste, bmakam, mcrosier, mehdi_amini, llvm-commits Differential Revision: http://reviews.llvm.org/D19904 llvm-svn: 275367
* [X86][AVX512] Add support for VPERMILPD/VPERMILPS variable shuffle mask commentsSimon Pilgrim2016-07-131-9/+26
| | | | llvm-svn: 275272
* X86: Avoid implicit iterator conversions, NFCDuncan P. N. Exon Smith2016-07-121-2/+4
| | | | | | | | Avoid implicit conversions from MachineInstrBundleIterator to MachineInstr*, mainly by preferring MachineInstr& over MachineInstr* and using range-based for loops. llvm-svn: 275149
* Drop support for creating $stubs.Rafael Espindola2016-06-291-27/+0
| | | | | | They are created by ld64 since OS X 10.5. llvm-svn: 274130
* doesSetDirectiveSuppressesReloc -> doesSetDirectiveSuppressReloc, theJoerg Sonnenberger2016-06-181-1/+1
| | | | | | former is grammatically incorrect. llvm-svn: 273100
* Avoid copies of std::strings and APInt/APFloats where we only read from itBenjamin Kramer2016-06-081-1/+1
| | | | | | | | As suggested by clang-tidy's performance-unnecessary-copy-initialization. This can easily hit lifetime issues, so I audited every change and ran the tests under asan, which came back clean. llvm-svn: 272126
* [X86][XOP] Added VPERMIL2PD/VPERMIL2PS shuffle mask comment decodingSimon Pilgrim2016-06-041-0/+34
| | | | llvm-svn: 271809
* Simplify handling of hidden stub.Rafael Espindola2016-05-171-14/+0
| | | | | | | | | Since r207518 they are printed exactly like non-hidden stubs on x86 and since r207517 on ARM. This means we can use a single set for all stubs in those platforms. llvm-svn: 269776
* [X86][SSE] Added TODO comment to add support for AVX512 mask registers to ↵Simon Pilgrim2016-05-091-0/+1
| | | | | | | | shuffle comments This came up in discussion on D19198 llvm-svn: 268915
* [X86] Model FAULTING_LOAD_OP as a terminator and branch.Quentin Colombet2016-05-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | This operation may branch to the handler block and we do not want it to happen anywhere within the basic block. Moreover, by marking it "terminator and branch" the machine verifier does not wrongly assume (because of AnalyzeBranch not knowing better) the branch is analyzable. Indeed, the target was seeing only the unconditional branch and not the faulting load op and thought it was a simple unconditional block. The machine verifier was complaining because of that and moreover, other optimizations could have done wrong transformation! In the process, simplify the representation of the handler block in the faulting load op. Now, we directly reference the handler block instead of using a label. This has the benefits of: 1. MC knows how to issue a label for a BB, so leave that to it. 2. Accessing the target BB from its label is painful, whereas it is direct from a MBB operand. Note: The 2 bytes offset in implicit-null-check.ll comes from the fact the unconditional jumps are not removed anymore, as the whole terminator sequence is not analyzable anymore. Will fix it in a subsequence commit. llvm-svn: 268327
* [X86] Use nested switches to vary the operand to helper functions that were ↵Craig Topper2016-04-291-43/+74
| | | | | | previously called in multiple cases. This seems to help the inliner reduce code. NFC llvm-svn: 267964
* [MC] Silence warning due to unused variable in !Debug builds.Davide Italiano2016-04-201-0/+1
| | | | llvm-svn: 266901
* [MC] EmitNop: Make an assertion more useful.Davide Italiano2016-04-201-1/+2
| | | | | | Differential Revision: http://reviews.llvm.org/D19334 llvm-svn: 266895
* [X86] Simplify StackMapShadowTracker; NFCSanjoy Das2016-04-191-33/+22
| | | | | | | | - Elide trivial contructor and desctructor - Move implementation out of an unnecessary explicit llvm namespace scope llvm-svn: 266794
* [X86MCInstLower] Clean up EmitNops; NFCSanjoy Das2016-04-191-51/+68
| | | | | | | Instead of having a conditional assert inside EmitNops, refactor so that the caller can have the assert instead. llvm-svn: 266793
* Introduce a "patchable-function" function attributeSanjoy Das2016-04-191-12/+53
| | | | | | | | | | | | | | | | | Summary: The `"patchable-function"` attribute can be used by an LLVM client to influence LLVM's code generation in ways that makes the generated code easily patchable at runtime (for instance, to redirect control). Right now only one patchability scheme is supported, `"prologue-short-redirect"`, but this can be expanded in the future. Reviewers: joker.eph, rnk, echristo, dberris Subscribers: joker.eph, echristo, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19046 llvm-svn: 266715
* [X86][XOP] Support for VPPERM 2-input shuffle mask decodingSimon Pilgrim2016-04-091-27/+56
| | | | | | | | | | This patch adds support for decoding XOP VPPERM instruction when it represents a basic shuffle. The mask decoding required the existing MCInstrLowering code to be updated to support binary shuffles - the implementation now matches what is done in X86InstrComments.cpp. Differential Revision: http://reviews.llvm.org/D18441 llvm-svn: 265874
* ADT: Remove == and != comparisons between ilist iterators and pointersDuncan P. N. Exon Smith2016-02-211-1/+1
| | | | | | | | | | | | | | I missed == and != when I removed implicit conversions between iterators and pointers in r252380 since they were defined outside ilist_iterator. Since they depend on getNodePtrUnchecked(), they indirectly rely on UB. This commit removes all uses of these operators. (I'll delete the operators themselves in a separate commit so that it can be easily reverted if necessary.) There should be NFC here. llvm-svn: 261498
* [X86] Make MOV32ri64 a post-RA pseudo instead of a CodeGenOnly instruction. ↵Craig Topper2016-01-051-4/+0
| | | | | | It was only needed for rematerialization. llvm-svn: 256818
* [X86] Move shuffle decoding for constant pool into the X86CodeGen library to ↵Craig Topper2015-12-311-0/+1
| | | | | | remove a layering violation in the Util library. llvm-svn: 256680
* [X86] Add support for printing shuffle comments for AVX512 PSHUFB instructions.Craig Topper2015-12-261-8/+37
| | | | llvm-svn: 256452
* [X86] Fix shuffle decoding for variable VPERMIL to be tolerant of the ↵Craig Topper2015-12-261-1/+8
| | | | | | Constant type not matching due to folding in the constant pool and to get VPERMILPD correct. llvm-svn: 256433
* [X86] MOVPC32r should only emit CFI adjustments when neededMichael Kuperstein2015-12-151-4/+5
| | | | | | | | | We only want to emit CFI adjustments when actually using DWARF. This fixes PR25828. Differential Revision: http://reviews.llvm.org/D15522 llvm-svn: 255664
* [X86] Always generate precise CFA adjustments.Michael Kuperstein2015-12-061-2/+4
| | | | | | | | | | This removes the code path that generate "synchronous" (only correct at call site) CFA. We will probably want to re-introduce it once we are capable of emitting different .eh_frame and .debug_frame sections. Differential Revision: http://reviews.llvm.org/D14948 llvm-svn: 254874
* [X86] Part 1 to fix x86-64 fp128 calling convention.Chih-Hung Hsieh2015-12-031-1/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Almost all these changes are conditioned and only apply to the new x86-64 f128 type configuration, which will be enabled in a follow up patch. They are required together to make new f128 work. If there is any error, we should fix or revert them as a whole. These changes should have no impact to current configurations. * Relax type legalization checks to accept new f128 type configuration, whose TypeAction is TypeSoftenFloat, not TypeLegal, but also has TLI.isTypeLegal true. * Relax GetSoftenedFloat to return in some cases f128 type SDValue, which is TLI.isTypeLegal but not "softened" to i128 node. * Allow customized FABS, FNEG, FCOPYSIGN on new f128 type configuration, to generate optimized bitwise operators for libm functions. * Enhance related Lower* functions to handle f128 type. * Enhance DAGTypeLegalizer::run, SoftenFloatResult, and related functions to keep new f128 type in register, and convert f128 operators to library calls. * Fix Combiner, Emitter, Legalizer routines that did not handle f128 type. * Add ExpandConstant to handle i128 constants, ExpandNode to handle ISD::Constant node. * Add one more parameter to getCommonSubClass and firstCommonClass, to guarantee that returned common sub class will contain the specified simple value type. This extra parameter is used by EmitCopyFromReg in InstrEmitter.cpp. * Fix infinite loop in getTypeLegalizationCost when f128 is the value type. * Fix printOperand to handle null operand. * Enhance ISD::BITCAST node to handle f128 constant. * Expand new f128 type for BR_CC, SELECT_CC, SELECT, SETCC nodes. * Enhance X86AsmPrinter to emit f128 values in comments. Differential Revision: http://reviews.llvm.org/D15134 llvm-svn: 254653
* Add cfi instr for CFA calculation when movpc is expanded to call and popPetar Jovanovic2015-11-051-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes the issue of wrong CFA calculation in the following case: 0x08048400 <+0>: push %ebx 0x08048401 <+1>: sub $0x8,%esp 0x08048404 <+4>: **call 0x8048409 <test+9>** 0x08048409 <+9>: **pop %eax** 0x0804840a <+10>: add $0x1bf7,%eax 0x08048410 <+16>: mov %eax,%ebx 0x08048412 <+18>: call 0x80483f0 <bar> 0x08048417 <+23>: add $0x8,%esp 0x0804841a <+26>: pop %ebx 0x0804841b <+27>: ret The highlighted instructions are a product of movpc instruction. The call instruction changes the stack pointer, and pop instruction restores its value. However, the rule for computing CFA is not updated and is wrong on the pop instruction. So, e.g. backtrace in gdb does not work when on the pop instruction. This adds cfi instructions for both call and pop instructions. cfi_adjust_cfa_offset** instruction is used with the appropriate offset for setting the rules to calculate CFA correctly. Patch by Violeta Vukobrat. Differential Revision: http://reviews.llvm.org/D14021 llvm-svn: 252176
* [X86] Add support to assembler and MCInst lowering to use the other vmovq ↵Craig Topper2015-10-121-12/+14
| | | | | | %xmmX, %xmmX encoding if it would be a shorter VEX encoding. llvm-svn: 250014
* [WinEH] Make FuncletLayout more robust against catchretDavid Majnemer2015-10-011-0/+29
| | | | | | | | | Catchret transfers control from a catch funclet to an earlier funclet. However, it is not completely clear which funclet the catchret target is part of. Make this clear by stapling the catchret target's funclet membership onto the CATCHRET SDAG node. llvm-svn: 249052
* [WinEH] Make funclet return instrs pseudo instrsReid Kleckner2015-09-171-7/+0
| | | | | | | | | This makes catchret look more like a branch, and less like a weird use of BlockAddress. It also lets us get away from llvm.x86.seh.restoreframe, which relies on the old parentfpoffset label arithmetic. llvm-svn: 247936
* [WinEH] Emit prologues and epilogues for funcletsReid Kleckner2015-09-081-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | Summary: 32-bit funclets have short prologues that allocate enough stack for the largest call in the whole function. The runtime saves CSRs for the funclet. It doesn't restore CSRs after we finally transfer control back to the parent funciton via a CATCHRET, but that's a separate issue. 32-bit funclets also have to adjust the incoming EBP value, which is what llvm.x86.seh.recoverframe does in the old model. 64-bit funclets need to spill CSRs as normal. For simplicity, this just spills the same set of CSRs as the parent function, rather than trying to compute different CSR sets for the parent function and each funclet. 64-bit funclets also allocate enough stack space for the largest outgoing call frame, like 32-bit. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12546 llvm-svn: 247092
* AVX-512: Lowering for 512-bit vector shuffles.Elena Demikhovsky2015-09-081-20/+31
| | | | | | | | Vector types: <8 x 64>, <16 x 32>, <32 x 16> float and integer. Differential Revision: http://reviews.llvm.org/D10683 llvm-svn: 246981
* [WinEH] Add some support for code generating catchpadReid Kleckner2015-08-271-0/+6
| | | | | | | We can now run 32-bit programs with empty catch bodies. The next step is to change PEI so that we get funclet prologues and epilogues. llvm-svn: 246235
* Remove and forbid raw_svector_ostream::flush() calls.Yaron Keren2015-08-131-1/+0
| | | | | | | | | | After r244870 flush() will only compare two null pointers and return, doing nothing but wasting run time. The call is not required any more as the stream and its SmallString are always in sync. Thanks to David Blaikie for reviewing. llvm-svn: 244928
* x86 atomic: optimize a.store(reg op a.load(acquire), release)JF Bastien2015-08-051-0/+12
| | | | | | | | | | | | Summary: PR24191 finds that the expected memory-register operations aren't generated when relaxed { load ; modify ; store } is used. This is similar to PR17281 which was addressed in D4796, but only for memory-immediate operations (and for memory orderings up to acquire and release). This patch also handles some floating-point operations. Reviewers: reames, kcc, dvyukov, nadav, morisset, chandlerc, t.p.northover, pete Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11382 llvm-svn: 244128
* [ImplicitNullChecks] Work with implicit defs.Sanjoy Das2015-07-201-1/+4
| | | | | | | | | | | | | | | Summary: This change generalizes the implicit null checks pass to work with instructions that don't have any explicit register defs. This lets us use X86's `cmp` against memory as faulting load instructions. Reviewers: reames, JosephTremoulet Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11286 llvm-svn: 242703
* Move most user of TargetMachine::getDataLayout to the Module oneMehdi Amini2015-07-161-3/+3
| | | | | | | | | | | | | | | | | | | | | Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. This patch is quite boring overall, except for some uglyness in ASMPrinter which has a getDataLayout function but has some clients that use it without a Module (llmv-dsymutil, llvm-dwarfdump), so some methods are taking a DataLayout as parameter. Reviewers: echristo Subscribers: yaron.keren, rafael, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11090 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 242386
* Simplify the Mangler interface now that DataLayout is mandatory.Rafael Espindola2015-06-231-1/+1
| | | | | | | We only need to pass in a DataLayout when mangling a raw string, not when constructing the mangler. llvm-svn: 240405
* Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)Alexander Kornienko2015-06-231-1/+1
| | | | | | Apparently, the style needs to be agreed upon first. llvm-svn: 240390
* Avoid a Symbol -> Name -> Symbol conversion.Rafael Espindola2015-06-221-5/+3
| | | | | | | | | | | | | | Before this we were producing a TargetExternalSymbol from a MCSymbol. That meant extracting the symbol name and fetching the symbol again down the pipeline. This patch adds a DAG.getMCSymbol that lets the MCSymbol pass unchanged on the DAG. Doing so removes the need for MO_NOPREFIX and fixes the root cause of pr23900, allowing r240130 to be committed again. llvm-svn: 240300
* Fixed/added namespace ending comments using clang-tidy. NFCAlexander Kornienko2015-06-191-1/+1
| | | | | | | | | | | | | The patch is generated using this command: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \ llvm/lib/ Thanks to Eugene Kosov for the original patch! llvm-svn: 240137
OpenPOWER on IntegriCloud