summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/X86/X86InstrInfo.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [X86] Mark pseudo instruction TEST8ri_NOEREX as hasSIdeEffects=0.Akira Hatanaka2014-07-101-0/+1
| | | | | | | | | Also, add a case clause in X86InstrInfo::shouldScheduleAdjacent to enable macro-fusion. <rdar://problem/15680770> llvm-svn: 212747
* [FastISel][X86] Optimize selects when the condition comes from a compare.Juergen Ributzka2014-06-231-2/+2
| | | | | | | Optimize the select instructions sequence to use the EFLAGS directly from a compare when possible. llvm-svn: 211543
* [FastISel][X86] Refactor the code to get the X86 condition from a helper ↵Juergen Ributzka2014-06-161-3/+2
| | | | | | | | | function. NFC. Make use of helper functions to simplify the branch and compare instruction selection in FastISel. Also add test cases for compare and conditonal branch. llvm-svn: 211077
* Remove the use of TargetMachine from X86InstrInfo.Eric Christopher2014-06-101-57/+56
| | | | llvm-svn: 210596
* Move X86RegisterInfo away from using the TargetMachine and onlyEric Christopher2014-06-101-1/+1
| | | | | | using the subtarget. llvm-svn: 210595
* Add a new attribute called 'jumptable' that creates jump-instruction tables ↵Tom Roeder2014-06-051-0/+11
| | | | | | | | | | | | for functions marked with this attribute. It includes a pass that rewrites all indirect calls to jumptable functions to pass through these tables. This also adds backend support for generating the jump-instruction tables on ARM and X86. Note that since the jumptable attribute creates a second function pointer for a function, any function marked with jumptable must also be marked with unnamed_addr. llvm-svn: 210280
* Fix a use of uninitialized value. OldCC is set when IsCmpZero || IsSwapped ↵Nick Lewycky2014-06-041-1/+1
| | | | | | and read when ShouldUpdateCC || IsSwapped, and ShouldUpdateCC is independent. Fixes PR19932, but no test since I wasn't able to get any symptoms to appear, not even with valgrind and the testcase from the PR. It's clear what happened from inspection of the code. llvm-svn: 210168
* Avoid using subtarget features when adding X86 specific passes toEric Christopher2014-05-221-2/+4
| | | | | | the pass pipeline. llvm-svn: 209382
* Rename createGlobalBaseRegPass -> createX86GlobalBaseRegPass to makeEric Christopher2014-05-221-1/+1
| | | | | | it obvious that it's a target specific pass. llvm-svn: 209380
* [X86] Tune LEA usage for SilvermontAlexey Volkov2014-05-201-6/+2
| | | | | | | | | | | | | According to Intel Software Optimization Manual on Silvermont in some cases LEA is better to be replaced with ADD instructions: "The rule of thumb for ADDs and LEAs is that it is justified to use LEA with a valid index and/or displacement for non-destructive destination purposes (especially useful for stack offset cases), or to use a SCALE. Otherwise, ADD(s) are preferable." Differential Revision: http://reviews.llvm.org/D3826 llvm-svn: 209198
* X86: If we have an instruction that sets a flag and a zero test on the input ↵Benjamin Kramer2014-05-141-3/+63
| | | | | | | | | | | | | | | | | | | | | of that instruction try to eliminate the test. For example tzcntl %edi, %ebx testl %edi, %edi je .label can be rewritten into tzcntl %edi, %ebx jb .label A minor complication is that tzcnt sets CF instead of ZF when the input is zero, we have to rewrite users of the flags from ZF to CF. Currently we recognize patterns using lzcnt, tzcnt and popcnt. Differential Revision: http://reviews.llvm.org/D3454 llvm-svn: 208788
* Use X86 memory operand enums instead of hardcoding.Craig Topper2014-05-061-16/+20
| | | | llvm-svn: 208064
* [C++] Use 'nullptr'. Target edition.Craig Topper2014-04-251-66/+72
| | | | llvm-svn: 207197
* [cleanup] Lift using directives, DEBUG_TYPE definitions, and even someChandler Carruth2014-04-221-2/+2
| | | | | | | | | | | | system headers above the includes of generated '.inc' files that actually contain code. In a few targets this was already done pretty consistently, but it wasn't done *really* consistently anywhere. It is strictly cleaner IMO and necessary in a bunch of places where the DEBUG_TYPE is referenced from the generated code. Consistency with the necessary places trumps. Hopefully the build bots are OK with the movement of intrin.h... llvm-svn: 206838
* [Modules] Make Support/Debug.h modular. This requires it to not changeChandler Carruth2014-04-211-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | behavior based on other files defining DEBUG_TYPE, which means it cannot define DEBUG_TYPE at all. This is actually better IMO as it forces folks to define relevant DEBUG_TYPEs for their files. However, it requires all files that currently use DEBUG(...) to define a DEBUG_TYPE if they don't already. I've updated all such files in LLVM and will do the same for other upstream projects. This still leaves one important change in how LLVM uses the DEBUG_TYPE macro going forward: we need to only define the macro *after* header files have been #include-ed. Previously, this wasn't possible because Debug.h required the macro to be pre-defined. This commit removes that. By defining DEBUG_TYPE after the includes two things are fixed: - Header files that need to provide a DEBUG_TYPE for some inline code can do so by defining the macro before their inline code and undef-ing it afterward so the macro does not escape. - We no longer have rampant ODR violations due to including headers with different DEBUG_TYPE definitions. This may be mostly an academic violation today, but with modules these types of violations are easy to check for and potentially very relevant. Where necessary to suppor headers with DEBUG_TYPE, I have moved the definitions below the includes in this commit. I plan to move the rest of the DEBUG_TYPE macros in LLVM in subsequent commits; this one is big enough. The comments in Debug.h, which were hilariously out of date already, have been updated to reflect the recommended practice going forward. llvm-svn: 206822
* [X86] As per suggestion from Craig Topper and Hal Finkel, overrideLang Hames2014-04-021-40/+36
| | | | | | | | | TargetInstrInfo::findCommutedOpIndices to enable VFMA*231 commutation, rather than abusing commuteInstruction. Thanks very much for the suggestion guys! llvm-svn: 205489
* [X86] Make the VFMA*231 variants commutable and relax the alignment restrictionsLang Hames2014-04-021-106/+145
| | | | | | | | | | | on FMA3 memory operands. FMA3 instructions are VEX encoded, so they can load from unaligned memory. Testcase to follow, along with related patch. <rdar://problem/16478629> llvm-svn: 205472
* AVX-512: Implemented masking for integer arithmetic & logic instructions.Elena Demikhovsky2014-03-271-2/+16
| | | | | | By Robert Khasanov rob.khasanov@gmail.com llvm-svn: 204906
* [X86] Add broadcast instructions to the table used by ExeDepsFix pass.Quentin Colombet2014-03-261-1/+7
| | | | | | | | | | | | | | | | | | | | | Adds the different broadcast instructions to the ReplaceableInstrsAVX2 table. That way the ExeDepsFix pass can take better decisions when AVX2 broadcasts are across domain (int <-> float). In particular, prior to this patch we were generating: vpbroadcastd LCPI1_0(%rip), %ymm2 vpand %ymm2, %ymm0, %ymm0 vmaxps %ymm1, %ymm0, %ymm0 ## <- domain change penalty Now, we generate the following nice sequence where everything is in the float domain: vbroadcastss LCPI1_0(%rip), %ymm2 vandps %ymm2, %ymm0, %ymm0 vmaxps %ymm1, %ymm0, %ymm0 <rdar://problem/16354675> llvm-svn: 204770
* Phase 2 of the great MachineRegisterInfo cleanup. This time, we're changingOwen Anderson2014-03-131-3/+3
| | | | | | | | | | operator* on the by-operand iterators to return a MachineOperand& rather than a MachineInstr&. At this point they almost behave like normal iterators! Again, this requires making some existing loops more verbose, but should pave the way for the big range-based for-loop cleanups in the future. llvm-svn: 203865
* [C++11] Add 'override' keyword to virtual methods that override their base ↵Craig Topper2014-03-091-6/+6
| | | | | | class. llvm-svn: 203378
* AVX-512: fixed comressed displacement - by Robert KhazanovElena Demikhovsky2014-03-061-0/+1
| | | | llvm-svn: 203096
* [C++11] Replace llvm::next and llvm::prior with std::next and std::prev.Benjamin Kramer2014-03-021-3/+3
| | | | | | Remove the old functions. llvm-svn: 202636
* Replace X86 FMA intrinsic pseduo-instructions with def pats.Lang Hames2014-01-311-8/+0
| | | | | | | | | | | It looks like these pseudos were only used for pattern matching. Def pats are the appropriate way to do that. As a bonus, these intrinsics will now have memory operands folded properly, and better FMA3 variants selected where appropriate (see r199933). <rdar://problem/15611947> llvm-svn: 200577
* AVX-512: added VPERM2D VPERM2Q VPERM2PS VPERM2PD instructions,Elena Demikhovsky2014-01-231-2/+4
| | | | | | they give better sequences than VPERMI llvm-svn: 199893
* AVX-512: Added more intrinsics for pmin/pmax, pabs, blend, pmuldq.Elena Demikhovsky2014-01-081-0/+4
| | | | llvm-svn: 198745
* Handle MOV32r0 in expandPostRAPseudo instead of MCInst lowering. No ↵Craig Topper2013-12-311-0/+2
| | | | | | functional change intended. llvm-svn: 198254
* AVX-512: Added legal type MVT::i1 and VK1 register for it.Elena Demikhovsky2013-12-161-8/+9
| | | | | | | | | Added scalar compare VCMPSS, VCMPSD. Implemented LowerSELECT for scalar FP operations. I replaced FSETCCss, FSETCCsd with one node type FSETCCs. Node extract_vector_elt(v16i1/v8i1, idx) returns an element of type i1. llvm-svn: 197384
* AVX-512: Changed intrinsics of VPCONFLICT to match GCC builtin formElena Demikhovsky2013-12-101-0/+16
| | | | llvm-svn: 196914
* Refactor a lot of patchpoint/stackmap related code to simplify and make itLang Hames2013-11-291-74/+0
| | | | | | | | | | | | | | | | | | | | | | target independent. Most of the x86 specific stackmap/patchpoint handling was necessitated by the use of the native address-mode format for frame index operands. PEI has now been modified to treat stackmap/patchpoint similarly to DEBUG_INFO, allowing us to use a simple, platform independent register/offset pair for frame indexes on stackmap/patchpoints. Notes: - Folding is now platform independent and automatically supported. - Emiting patchpoints with direct memory references now just involves calling the TargetLoweringBase::emitPatchPoint utility method from the target's XXXTargetLowering::EmitInstrWithCustomInserter method. (See X86TargetLowering for an example). - No more ugly platform-specific operand parsers. This patch shouldn't change the generated output for X86. llvm-svn: 195944
* StackMap: Implement support for DirectMemRefOp.Andrew Trick2013-11-261-10/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | A Direct stack map location records the address of frame index. This address is itself the value that the runtime requested. This differs from IndirectMemRefOp locations, which refer to a stack locations from which the requested values must be loaded. Direct locations can directly communicate the address if an alloca, while IndirectMemRefOp handle register spills. For example: entry: %a = alloca i64... llvm.experimental.stackmap(i32 <ID>, i32 <shadowBytes>, i64* %a) Since both the alloca and stackmap intrinsic are in the entry block, and the intrinsic takes the address of the alloca, the runtime can assume that LLVM will not substitute alloca with any intervening value. This must be verified by the runtime by checking that the stack map's location is a Direct location type. The runtime can then determine the alloca's relative location on the stack immediately after compilation, or at any time thereafter. This differs from Register and Indirect locations, because the runtime can only read the values in those locations when execution reaches the instruction address of the stack map. llvm-svn: 195712
* Use symbolic operands in the patchpoint folding routine and fix a spilling bug.Andrew Trick2013-11-191-8/+6
| | | | | | Fixes <rdar://15487687> [JS] AnyRegCC argument ends up being spilled llvm-svn: 195094
* [weak vtables] Remove a bunch of weak vtablesJuergen Ributzka2013-11-191-1/+4
| | | | | | | | | | | | This patch removes most of the trivial cases of weak vtables by pinning them to a single object file. The memory leaks in this version have been fixed. Thanks Alexey for pointing them out. Differential Revision: http://llvm-reviews.chandlerc.com/D2068 Reviewed by Andy llvm-svn: 195064
* Revert r194865 and r194874.Alexey Samsonov2013-11-181-4/+1
| | | | | | | | | | | | This change is incorrect. If you delete virtual destructor of both a base class and a subclass, then the following code: Base *foo = new Child(); delete foo; will not cause the destructor for members of Child class. As a result, I observe plently of memory leaks. Notable examples I investigated are: ObjectBuffer and ObjectBufferStream, AttributeImpl and StringSAttributeImpl. llvm-svn: 194997
* Added a size field to the stack map record to handle subregister spills.Andrew Trick2013-11-171-1/+12
| | | | | | | | Implementing this on bigendian platforms could get strange. I added a target hook, getStackSlotRange, per Jakob's recommendation to make this as explicit as possible. llvm-svn: 194942
* During folding for patchpoint/stackmap instructions, defer creation of new MIsLang Hames2013-11-151-4/+5
| | | | | | | | until we know that folding will be successful. No functional change. llvm-svn: 194880
* [weak vtables] Remove a bunch of weak vtablesJuergen Ributzka2013-11-151-1/+4
| | | | | | | | | | | This patch removes most of the trivial cases of weak vtables by pinning them to a single object file. Differential Revision: http://llvm-reviews.chandlerc.com/D2068 Reviewed by Andy llvm-svn: 194865
* AVX-512: Handled extractelement from mask vector;Elena Demikhovsky2013-11-141-2/+4
| | | | | | Added VMOSHDUP/VMOVSLDUP shuffle instructions. llvm-svn: 194691
* Cleanup the stackmap operand folding code and fix a corner case.Andrew Trick2013-11-121-6/+12
| | | | | | | I still don't know how to refer to the fixed operands symbolically. I plan to look into it. llvm-svn: 194529
* Simplify operand folding when rematerializing a load.Andrew Trick2013-11-121-1/+6
| | | | | | | | | | | | We already know how to fold a reload from a frameindex without analyzing the load instruction. Generalize this to handle any frameindex load. This streamlines the logic for rematerializing loads from stack arguments. As a side effect, it allows stackmaps to record a stack argument location without spilling it. Verified no effect on codegen for llvm test-suite. llvm-svn: 194497
* Fix the recently added anyregcc convention to handle spilled operands.Andrew Trick2013-11-111-1/+10
| | | | | | | | | | | | Fixes <rdar://15432754> [JS] Assertion: "Folded a def to a non-store!" The primary purpose of anyregcc is to prevent a patchpoint's call arguments and return value from being spilled. They must be available in a register, although the calling convention does not pin the register. It's up to the front end to avoid using this convention for calls with more arguments than allocatable registers. llvm-svn: 194428
* [Stackmap] Add AnyReg calling convention support for patchpoint intrinsic.Juergen Ributzka2013-11-081-2/+8
| | | | | | | | | | | | | | The idea of the AnyReg Calling Convention is to provide the call arguments in registers, but not to force them to be placed in a paticular order into a specified set of registers. Instead it is up tp the register allocator to assign any register as it sees fit. The same applies to the return value (if applicable). Differential Revision: http://llvm-reviews.chandlerc.com/D2009 Reviewed by Andy llvm-svn: 194293
* Add support for stack map generation in the X86 backend.Andrew Trick2013-10-311-4/+39
| | | | | | Originally implemented by Lang Hames. llvm-svn: 193811
* Replace (V)MOVZDI2PDIrr/rm instructions with patterns that select ↵Craig Topper2013-10-221-2/+0
| | | | | | (V)MOVDI2PDIrr/rm. llvm-svn: 193146
* Fix the ExecutionDepsFix pass to handle AVX instructions.Andrew Trick2013-10-141-14/+67
| | | | | | | | | | | | | | | | | | | | | | | | This pass is needed to break false dependencies. Without it, unlucky register assignment can result in wild (5x) swings in performance. This pass was trying to handle AVX but not getting it right. AVX doesn't have partial register defs, it has unused register reads in which the high bits of a source operand are copied into the unused bits of the dest. Fixing this requires conservative liveness analysis. This is awkard because the pass already has its own pseudo-liveness. However, proper liveness is expensive, and we would like to use a generic utility to compute it. The fix only invokes liveness on-demand. It is rare to detect a case that needs undef-read dependence breaking, but when it happens, it can be needed many times within a very large block. I think the existing heuristic which uses a register window of 16 is too conservative for loop-carried false dependencies. If the loop is a reduction. The out-of-order engine may be able to execute several loop iterations in parallel. However, I'll leave this tuning exercise for next time. llvm-svn: 192635
* whitespaceAndrew Trick2013-10-141-1/+1
| | | | llvm-svn: 192633
* Remove FsMOVAPSrr and friends. They have no patterns and are no longer ↵Craig Topper2013-10-071-8/+0
| | | | | | selected anywhere. llvm-svn: 192089
* X86: Don't fold spills into SSE operations if the stack is unaligned.Benjamin Kramer2013-10-061-0/+4
| | | | | | | Regalloc can emit unaligned spills nowadays, but we can't fold the spills into SSE ops if we can't guarantee alignment. PR12250. llvm-svn: 192064
* AVX-512: added scalar convert instructions and intrinsics.Elena Demikhovsky2013-10-061-4/+5
| | | | | | Fixed load folding in VPERM2I instruction. llvm-svn: 192063
* Add TBM instructions to loading folding tables.Craig Topper2013-10-051-1/+21
| | | | llvm-svn: 192046
OpenPOWER on IntegriCloud