path: root/llvm/lib/Target/X86
Commit message | Author | Age | Files | Lines
* Replace several 'assert(false' with 'llvm_unreachable' or fold a condition into the assert. | Craig Topper | 2015-01-05 | 2 | -14/+5
  llvm-svn: 225160
* [X86] Remove the predicates from the register forms of the 2-byte inc and dec instructions. | Craig Topper | 2015-01-05 | 2 | -44/+24
  Remove the 32-bit mode only versions that existed for the disassembler. Move the patterns out of the instructions so they can still be qualified with predicates.
  llvm-svn: 225157
* [X86] Simplify code a little by just summing flags instead of conditionally incrementing. NFC | Craig Topper | 2015-01-05 | 1 | -18/+7
  llvm-svn: 225156
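The simplification named in r225156 can be illustrated with a small standalone sketch. The function and parameter names below are hypothetical and are not taken from the X86 backend (the log does not show the changed code); the sketch only shows that a chain of conditional increments over boolean flags is equal to the sum of those flags.

```cpp
// Hypothetical flags: the variables touched by the real commit are not shown in this log.

// Before: bump a counter once for every flag that is set.
inline unsigned countSetBranchy(bool hasA, bool hasB, bool hasC) {
  unsigned n = 0;
  if (hasA) ++n;
  if (hasB) ++n;
  if (hasC) ++n;
  return n;
}

// After: a bool converts to 0 or 1, so the flags can simply be summed.
inline unsigned countSetSummed(bool hasA, bool hasB, bool hasC) {
  return static_cast<unsigned>(hasA) + hasB + hasC;
}
```

Both functions return the same count for every combination of inputs; the summed form just has no branches to read through.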
* [X86] Remove unnecessary redeclaration of a variable with the same assignment as the beginning of the function. NFC. | Craig Topper | 2015-01-05 | 1 | -1/+0
  llvm-svn: 225155
* [X86] Remove a strange fixme referring to a hack that doesn't seem to exist since the code is in a comment. | Craig Topper | 2015-01-05 | 1 | -3/+0
  Can't figure out what the body of the 'if' was supposed to be anyway.
  llvm-svn: 225154
* [x86] Reduce text duplication for similar operand class declarations in tablegen instruction info. | Craig Topper | 2015-01-05 | 1 | -268/+178
  No functional change intended.
  llvm-svn: 225153
* [X86] Fix the immediate size to match the address size in the operand types for the move to/from absolute memory instructions. | Craig Topper | 2015-01-05 | 1 | -7/+7
  llvm-svn: 225152
* Minor cleanup to all the switches after MatchInstructionImpl in all the AsmParsers. | Craig Topper | 2015-01-03 | 1 | -1/+1
  Make sure they all have llvm_unreachable on the default path out of the switch. Remove unnecessary "default: break". Remove a 'return' after unreachable. Fix some indentation.
  llvm-svn: 225114
* [X86] Disassembler support for move to/from %rax with a 32-bit memory offset if REX.W and AdSize prefix are both present. | Craig Topper | 2015-01-03 | 3 | -0/+22
  llvm-svn: 225099
* [X86] Use 32-bit sign extended immediate for 64-bit LOCK_ArithBinOp with sign extended immediate. | Craig Topper | 2015-01-03 | 1 | -6/+6
  llvm-svn: 225098
* Improved comments. No functional change intended. | Andrea Di Biagio | 2015-01-02 | 1 | -2/+2
  llvm-svn: 225080
* [X86] Bring some better consistency to the naming of the move to/from %al/ax/eax/rax with memory offset. | Craig Topper | 2015-01-02 | 2 | -44/+41
  llvm-svn: 225078
* [X86] Make the instructions that use AdSize16/32/64 co-exist together without using mode predicates. | Craig Topper | 2015-01-02 | 5 | -144/+241
  This is necessary to allow the disassembler to handle AdSize32 instructions in 64-bit mode when the address size prefix is used. Eventually we should probably also support 'addr32' and 'addr16' in the assembler to override the address size on some of these instructions. But for now we'll just use special operand types that will look up the current mode size to select the right instruction.
  llvm-svn: 225075
* Add r224985 back with a fix. | Rafael Espindola | 2014-12-31 | 2 | -83/+46
  The issue was that AArch64 has additional restrictions on when local relocations can be used. We have to take those into consideration when deciding to put a L symbol in the symbol table or not.

  Original message:

  Remove doesSectionRequireSymbols.

  In an assembly expression like

      bar:
        .long L0 + 1

  the intended semantics is that bar will contain a pointer one byte past L0.

  In sections that are merged by content (strings, 4 byte constants, etc), a single position in the section doesn't give the linker enough information. For example, it would not be able to tell a relocation must point to the end of a string, since that would look just like the start of the next.

  The solution used in ELF is to use relocation with symbols if there is a non-zero addend.

  In MachO before this patch we would just keep all symbols in some sections. This would miss some cases (only cstrings on x86_64 were implemented) and was inefficient since most relocations have an addend of 0 and can be represented without the symbol.

  This patch implements the non-zero addend logic for MachO too.
  llvm-svn: 225048
* Revert "Remove doesSectionRequireSymbols." | Rafael Espindola | 2014-12-31 | 2 | -46/+83
  This reverts commit r224985. I am investigating why it made an Apple bot unhappy.
  llvm-svn: 225044
* [X86] Fix disassembly of absolute moves to work correctly in 16 and 32-bit modes with all 4 combinations of OpSize and AdSize prefixes being present or not. | Craig Topper | 2014-12-31 | 2 | -6/+35
  llvm-svn: 225036
* [x86] Simplify detection of jcxz/jecxz/jrcxz in disassembler. | Craig Topper | 2014-12-31 | 1 | -16/+5
  llvm-svn: 225035
* x86_64: Fix calls to __morestack under the large code model. | Peter Collingbourne | 2014-12-30 | 1 | -6/+30
  Under the large code model, we cannot assume that __morestack lives within 2^31 bytes of the call site, so we cannot use pc-relative addressing. We cannot perform the call via a temporary register, as the rax register may be used to store the static chain, and all other suitable registers may be either callee-save or used for parameter passing. We cannot use the stack at this point either because __morestack manipulates the stack directly.

  To avoid these issues, perform an indirect call via a read-only memory location containing the address.

  This solution is not perfect, as it assumes that the .rodata section is laid out within 2^31 bytes of each function body, but this seems to be sufficient for JIT.

  Differential Revision: http://reviews.llvm.org/D6787
  llvm-svn: 225003
* Remove doesSectionRequireSymbols. | Rafael Espindola | 2014-12-30 | 2 | -83/+46
  In an assembly expression like

      bar:
        .long L0 + 1

  the intended semantics is that bar will contain a pointer one byte past L0.

  In sections that are merged by content (strings, 4 byte constants, etc), a single position in the section doesn't give the linker enough information. For example, it would not be able to tell a relocation must point to the end of a string, since that would look just like the start of the next.

  The solution used in ELF is to use relocation with symbols if there is a non-zero addend.

  In MachO before this patch we would just keep all symbols in some sections. This would miss some cases (only cstrings on x86_64 were implemented) and was inefficient since most relocations have an addend of 0 and can be represented without the symbol.

  This patch implements the non-zero addend logic for MachO too.
  llvm-svn: 224985
* [X86] Fix some cases where some 8-bit instructions were marked as being convertible to three address instructions, but aren't really. | Craig Topper | 2014-12-29 | 1 | -18/+24
  llvm-svn: 224940
* [X86] Add the 0x82 instructions to the disassembler. | Craig Topper | 2014-12-29 | 1 | -6/+35
  They are identical in functionality to the 0x80 opcode instructions, but are not valid in 64-bit mode.
  llvm-svn: 224939
* [x86] Refactor some tablegen instruction info classes slightly to prepare for another change. NFC. | Craig Topper | 2014-12-29 | 1 | -29/+28
  llvm-svn: 224938
* [x86] Remove unused classes from tablegen instruction info. | Craig Topper | 2014-12-29 | 1 | -23/+0
  llvm-svn: 224937
* Add segmented stack support for DragonFlyBSD. | Rafael Espindola | 2014-12-29 | 2 | -3/+12
  Patch by Michael Neumann.
  llvm-svn: 224936
* Refactor duplicated code. | Rafael Espindola | 2014-12-29 | 2 | -21/+2
  No intended functionality change.
  llvm-svn: 224935
* [X86][ISel] Fix a regression I introduced in r224884 | Keno Fischer | 2014-12-28 | 1 | -3/+3
  In the else case, ResultReg was not checked for validity. To my surprise, this case was not hit in any of the existing test cases. This includes a new test case that tests this path.

  Also drop the `target triple` declaration from the original test as suggested by H.J. Lu, because apparently with it the test won't be run on Linux.
  llvm-svn: 224901
* [X86] Add missing memory variants to AVX false dependency breaking | Michael Kuperstein | 2014-12-28 | 1 | -2/+26
  Adds missing memory instruction variants to AVX false dependency breaking handling. (SSE was handled in r224246)

  Differential Revision: http://reviews.llvm.org/D6780
  llvm-svn: 224900
* [CodeGenPrepare] Teach when it is profitable to speculate calls to @llvm.cttz/ctlz. | Andrea Di Biagio | 2014-12-28 | 2 | -0/+14
  If the control flow is modelling an if-statement where the only instruction in the 'then' basic block (excluding the terminator) is a call to cttz/ctlz, CodeGenPrepare can try to speculate the cttz/ctlz call and simplify the control flow graph.

  Example:

      entry:
        %cmp = icmp eq i64 %val, 0
        br i1 %cmp, label %end.bb, label %then.bb

      then.bb:
        %c = tail call i64 @llvm.cttz.i64(i64 %val, i1 true)
        br label %end.bb

      end.bb:
        %cond = phi i64 [ %c, %then.bb ], [ 64, %entry]

  In this example, basic block %then.bb is taken if value %val is not zero. Also, the phi node in %end.bb would propagate the size-of in bits of %val only if %val is equal to zero.

  With this patch, CodeGenPrepare will try to hoist the call to cttz from %then.bb into basic block %entry only if cttz is cheap to speculate for the target.

  Added two new hooks in TargetLowering.h to let targets customize the behavior (i.e. decide whether it is cheap or not to speculate calls to cttz/ctlz). The two new methods are 'isCheapToSpeculateCtlz' and 'isCheapToSpeculateCttz'. By default, both methods return 'false'.

  On X86, method 'isCheapToSpeculateCtlz' returns true only if the target has LZCNT. Method 'isCheapToSpeculateCttz' only returns true if the target has BMI.

  Differential Revision: http://reviews.llvm.org/D6728
  llvm-svn: 224899
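A rough C++ model of why the speculation in r224899 preserves the result, under the assumption that the target's count-trailing-zeros operation is defined for a zero input and yields the bit width (the commit ties this to BMI on X86). All function names here are illustrative and are not part of LLVM; this is a sketch of the value equivalence, not of the CodeGenPrepare implementation.

```cpp
#include <cstdint>

// Models cttz with the "zero is undefined" flag set: callers must pass val != 0.
inline unsigned cttzNonZero(std::uint64_t val) {
  unsigned n = 0;
  while ((val & 1u) == 0) {
    val >>= 1;
    ++n;
  }
  return n;
}

// Shape of the original control flow: branch around the call and
// select 64 (the bit width) when the input is zero.
inline unsigned cttzBranchy(std::uint64_t val) {
  return val == 0 ? 64u : cttzNonZero(val);
}

// Shape after speculation: a count that is defined for zero and returns
// 64 in that case, so no guard is needed around the call.
inline unsigned cttzZeroDefined(std::uint64_t val) {
  unsigned n = 0;
  while (n < 64 && ((val >> n) & 1u) == 0)
    ++n;
  return n;
}
```

For every input, `cttzBranchy` and `cttzZeroDefined` agree, which is what lets the branch and phi collapse when the zero-defined operation is cheap on the target.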
* [x86] Prevent instruction selection of AVX512 cmp.ps/pd/ss/sd intrinsics with illegal immediates. | Craig Topper | 2014-12-27 | 2 | -22/+23
  Correctly this time. I did the wrong patterns the first time.
  llvm-svn: 224891
* Fixing another -Wunused-variable warning, this time in release builds without asserts. NFC. | Aaron Ballman | 2014-12-27 | 1 | -3/+3
  llvm-svn: 224889
* Removing a variable that is set but never used, to silence a -Wunused-but-set-variable warning; NFC. | Aaron Ballman | 2014-12-27 | 1 | -4/+0
  llvm-svn: 224888
* [x86] Prevent instruction selection of AVX512 cmp.ps/pd/ss/sd intrinsics with illegal immediates. | Craig Topper | 2014-12-27 | 1 | -15/+18
  Forgot to do this when I did SSE/SSE2/AVX/AVX2.
  llvm-svn: 224887
* [x86] Assert on invalid immediates in the instruction printer for cmp.ps/pd/ss/sd instead of truncating the immediate. | Craig Topper | 2014-12-27 | 2 | -4/+8
  The assembly parser and instruction selection shouldn't generate invalid immediates.
  llvm-svn: 224886
* [x86] Prevent llvm.x86.cmp.ps/pd/ss/sd from being selected with bad immediates. | Craig Topper | 2014-12-27 | 2 | -26/+33
  The frontend now checks this when the builtin is used. This will allow the instruction printer to not have to deal with invalid immediates on these instructions.
  llvm-svn: 224885
* [FastIsel][X86] Fix invalid register replacement for bool args | Keno Fischer | 2014-12-27 | 1 | -28/+28
  Summary:
  Consider the following IR:

      %3 = load i8* undef
      %4 = trunc i8 %3 to i1
      %5 = call %jl_value_t.0* @foo(..., i1 %4, ...)
      ret %jl_value_t.0* %5

  Bools (that are the result of direct truncs) are lowered as whatever the argument to the trunc was and a "and 1", causing the part of the MBB responsible for this argument to look something like this:

      %vreg8<def,tied1> = AND8ri %vreg7<kill,tied0>, 1, %EFLAGS<imp-def>; GR8:%vreg8,%vreg7

  Later, when the load is lowered, it will insert

      %vreg15<def> = MOV8rm %vreg14, 1, %noreg, 0, %noreg; mem:LD1[undef] GR8:%vreg15 GR64:%vreg14

  but remember to (at the end of isel) replace vreg7 by vreg15. Now for the bug. In fast isel lowering, we mistakenly mark vreg8 as the result of the load instead of the trunc. This adds a fixup to have vreg8 replaced by whatever the result of the load is as well, so we end up with

      %vreg15<def,tied1> = AND8ri %vreg15<kill,tied0>, 1, %EFLAGS<imp-def>; GR8:%vreg15

  which is an SSA violation and causes problems later down the road.

  This fixes PR21557.

  Test Plan: The test case from PR21557 is added to the test suite.
  Reviewers: ributzka
  Reviewed By: ributzka
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D6245
  llvm-svn: 224884
* [X86] Add the debug registers DR8-DR15 so we can assemble and disassemble references to them. | Craig Topper | 2014-12-26 | 3 | -11/+25
  llvm-svn: 224862
* [X86] Don't fail disassembly if REX.R/REX.B is used on an MMX register. | Craig Topper | 2014-12-26 | 2 | -6/+9
  Similar fix to not fail to disassemble CR9-CR15 references.
  llvm-svn: 224861
* Teach disassembler to handle illegal immediates on (v)cmpps/pd/ss/sd instructions. | Craig Topper | 2014-12-26 | 4 | -61/+75
  Instead of rejecting we'll just generate the _alt forms that don't try to alter the mnemonic. While I'm here, merge some common code in the Instruction printers for the condition code replacement and fix the mask on SSE to be 3-bits instead of 4.
  llvm-svn: 224846
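The 3-bit mask mentioned in r224846 reflects that the legacy SSE compare instructions define only eight predicates, selected by immediates 0 through 7. The sketch below is a standalone illustration of mapping an immediate to the familiar `cmp<cc>ps` pseudo-op and falling back to the plain mnemonic otherwise. The function name and the fallback policy are assumptions made for illustration; the actual printer and `_alt` instruction definitions in LLVM are not shown in this log.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: choose a mnemonic for an SSE cmpps immediate.
// Immediates 0..7 have dedicated pseudo-op spellings; anything else keeps
// the generic "cmpps" spelling, with the immediate printed as an operand.
inline std::string sseCmppsMnemonic(std::uint8_t imm) {
  static const char *const Suffix[8] = {"eq",  "lt",  "le",  "unord",
                                        "neq", "nlt", "nle", "ord"};
  if (imm < 8)
    return std::string("cmp") + Suffix[imm] + "ps";
  return "cmpps";
}
```

For example, an immediate of 0 yields `cmpeqps`, while an immediate of 8 is outside the SSE range and is left as `cmpps` rather than being silently truncated to a valid predicate.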
* Use MCPhysReg for table of register encodings. | Craig Topper | 2014-12-26 | 1 | -3/+3
  llvm-svn: 224845
* Masked Load/Store - Changed the order of parameters in intrinsics. | Elena Demikhovsky | 2014-12-25 | 1 | -3/+3
  No functional changes. The documentation is coming.
  llvm-svn: 224829
* [X86] Remove the single AdSize indicator and replace it with separate AdSize16/32/64 flags. | Craig Topper | 2014-12-24 | 5 | -93/+107
  This removes a hardcoded list of instructions in the CodeEmitter. Eventually I intend to remove the predicates on the affected instructions since in any given mode two of them are valid if we supported addr32/addr16 prefixes in the assembler.
  llvm-svn: 224809
* AVX-512: Added FMA instructions, intrinsics and tests for KNL and SKX targets | Elena Demikhovsky | 2014-12-23 | 3 | -81/+101
  by Asaf Badouh
  http://reviews.llvm.org/D6456
  llvm-svn: 224764
* AVX-512: BLENDM - fixed encoding of the broadcast version | Elena Demikhovsky | 2014-12-23 | 2 | -2/+3
  Added more intrinsics and encoding tests.
  llvm-svn: 224760
* X86: Don't over-align combined loads. | Jim Grosbach | 2014-12-23 | 1 | -8/+3
  When combining consecutive loads+inserts into a single vector load, we should keep the alignment of the base load. Doing otherwise can, and does, lead to using overly aligned instructions. In the included test case, for example, using a 32-byte vmovaps on a 16-byte aligned value. Oops.

  rdar://19190968
  llvm-svn: 224746
* Make musttail more robust for vector types on x86 | Reid Kleckner | 2014-12-22 | 2 | -100/+107
  Previously I tried to plug musttail into the existing vararg lowering code. That turned out to be a mistake, because non-vararg calls use significantly different register lowering, even on x86. For example, AVX vectors are usually passed in registers to normal functions and memory to vararg functions. Now musttail uses a completely separate lowering.

  Hopefully this can be used as the basis for non-x86 perfect forwarding.

  Reviewers: majnemer
  Differential Revision: http://reviews.llvm.org/D6156
  llvm-svn: 224745
* [x86] Add vector @llvm.ctpop intrinsic custom lowering | Bruno Cardoso Lopes | 2014-12-22 | 1 | -0/+152
  Currently, when ctpop is supported for scalar types, the expansion of @llvm.ctpop.vXiY uses vector element extractions, insertions and individual calls to @llvm.ctpop.iY. When not, expansion with bit-math operations is used for the scalar calls.

  Local haswell measurements show that we can improve vector @llvm.ctpop.vXiY expansion in some cases by using a vector parallel bit twiddling approach, based on:

      v = v - ((v >> 1) & 0x55555555);
      v = (v & 0x33333333) + ((v >> 2) & 0x33333333);
      v = (v + (v >> 4)) & 0xF0F0F0F
      v = v + (v >> 8)
      v = v + (v >> 16)
      v = v & 0x0000003F

  (from http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel)

  When scalar ctpop isn't supported, the approach above performs better for v2i64, v4i32, v4i64 and v8i32 (see numbers below). And even when scalar ctpop is supported, this approach performs ~2x better for v8i32.

  Here, x86_64 implies -march=corei7-avx without ctpop and x86_64h includes ctpop support with -march=core-avx2.

      == [x86_64h - new]
      v8i32: 0.661685
      v4i32: 0.514678
      v4i64: 0.652009
      v2i64: 0.324289

      == [x86_64h - old]
      v8i32: 1.29578
      v4i32: 0.528807
      v4i64: 0.65981
      v2i64: 0.330707

      == [x86_64 - new]
      v8i32: 1.003
      v4i32: 0.656273
      v4i64: 1.11711
      v2i64: 0.754064

      == [x86_64 - old]
      v8i32: 2.34886
      v4i32: 1.72053
      v4i64: 1.41086
      v2i64: 1.0244

  More work for other vector types will come next.
  llvm-svn: 224725
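The bit-twiddling steps quoted in r224725 can be written out as a scalar C++ function for a single 32-bit lane. This is a sketch of the arithmetic only: the name `popcount32` is made up for illustration, and the commit itself applies the same sequence across vector lanes in the X86 lowering code, which is not reproduced here.

```cpp
#include <cstdint>

// Parallel bit count for one 32-bit lane, following the steps listed in the
// commit message (after the "Bit Twiddling Hacks" page it cites).
inline std::uint32_t popcount32(std::uint32_t v) {
  v = v - ((v >> 1) & 0x55555555u);                 // 2-bit partial sums
  v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u); // 4-bit partial sums
  v = (v + (v >> 4)) & 0x0F0F0F0Fu;                 // 8-bit partial sums
  v = v + (v >> 8);                                 // fold bytes together
  v = v + (v >> 16);                                // fold halves together
  return v & 0x0000003Fu;                           // count fits in 6 bits
}
```

Each line widens the partial sums without any branches or per-bit loops, which is what makes the sequence straightforward to apply to every lane of a vector at once.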
* AVX-512: Added all forms of BLENDM instructions, intrinsics, encoding tests for AVX-512F and skx instructions. | Elena Demikhovsky | 2014-12-22 | 3 | -55/+120
  llvm-svn: 224707
* [X86] Add hasSideEffects = 0 to CALLpcrel16. | Craig Topper | 2014-12-21 | 1 | -4/+5
  This matches what is inferred from patterns for the 32-bit version.
  llvm-svn: 224692
* [X86] Swap operand order in Intel syntax on a bunch of aliases. | Craig Topper | 2014-12-20 | 1 | -18/+18
  llvm-svn: 224687
* [X86] Swap operand order of imul aliases in Intel syntax. | Craig Topper | 2014-12-20 | 1 | -6/+6
  Also disable printing of the alias instead of the real instruction.
  llvm-svn: 224686