path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
...
* Add LLVMScalarOpts to LLVMPowerPCCodeGen. | NAKAMURA Takumi | 2014-11-21 | 1 | -1/+1
  llvm-svn: 222516
* DAGCombiner: Allow the DAGCombiner to combine multiple FDIVs with the same divisor into FMULs by the reciprocal. | Hao Liu | 2014-11-21 | 2 | -0/+7
  E.g., (a / D; b / D) -> (recip = 1.0 / D; a * recip; b * recip).
  A hook is added to allow the target to control whether it needs to do such a combine.
  Reviewed in http://reviews.llvm.org/D6334
  llvm-svn: 222510
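  The rewrite described in the entry above can be sketched as follows. This is only an illustration over a toy list of operations, not the DAGCombiner code: the tuple layout, the function name `combine_repeated_divisors` and the `min_uses` threshold (standing in for the target hook) are all assumptions made for the example.

```python
from collections import defaultdict

def combine_repeated_divisors(divs, min_uses=2):
    """Rewrite divisions that share a divisor into one reciprocal plus multiplies.

    divs is a list of (dest, dividend, divisor) names. min_uses is a
    hypothetical stand-in for the target hook: a divisor is only rewritten
    when at least that many divisions use it.
    """
    uses = defaultdict(int)
    for _, _, divisor in divs:
        uses[divisor] += 1

    out = []
    recip_of = {}  # divisor name -> name of its reciprocal value
    for dest, dividend, divisor in divs:
        if uses[divisor] >= min_uses:
            if divisor not in recip_of:
                # Emit the reciprocal once, before its first use.
                recip_of[divisor] = f"recip_{divisor}"
                out.append(("fdiv", recip_of[divisor], "1.0", divisor))
            out.append(("fmul", dest, dividend, recip_of[divisor]))
        else:
            # A divisor used only once gains nothing from the rewrite.
            out.append(("fdiv", dest, dividend, divisor))
    return out
```

  Note that a * (1.0 / D) is not always bit-identical to a / D in floating point, so a rewrite like this only makes sense where that loss of precision is acceptable.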
* Remove a bunch of unnecessary typecasts to 'const TargetRegisterClass *' | Craig Topper | 2014-11-21 | 7 | -67/+38
  llvm-svn: 222509
* [PPC] Use SeparateConstOffsetFromGEP | Hal Finkel | 2014-11-21 | 1 | -0/+20
  This mirrors r222331, which enabled SeparateConstOffsetFromGEP on AArch64, in the PowerPC backend. Yields, on a POWER7 machine, a 30% speedup on SingleSource/Benchmarks/Shootout/nestedloop (this might just be from LICM; there is a store moved out of the inner loop) and a potential speedup on MultiSource/Benchmarks/mediabench/mpeg2/mpeg2dec/mpeg2decode. Regardless, it makes some code look cleaner, and synchronizing the backends in this regard seems like a generally good thing.
  llvm-svn: 222504
* [X86] Do not custom lower UINT_TO_FP when the target type does not match the custom lowering. | Quentin Colombet | 2014-11-21 | 1 | -0/+5
  <rdar://problem/19026326>
  llvm-svn: 222489
* Fix more instances of -Wsentinel on Windows with s/NULL/nullptr/ | Reid Kleckner | 2014-11-20 | 4 | -5/+5
  Follow up to r221940, where I must not have caught em all. NFC
  llvm-svn: 222481
* Add out of line virtual destructors to all LLVMTargetMachine subclasses | Reid Kleckner | 2014-11-20 | 22 | -3/+37
  These recently all grew a unique_ptr<TargetLoweringObjectFile> member in r221878. When anyone calls a virtual method of a class, clang-cl requires all virtual methods to be semantically valid. This includes the implicit virtual destructor, which triggers instantiation of the unique_ptr destructor, which fails because the type being deleted is incomplete.
  This is just part of the ongoing saga of PR20337, which is affecting Blink as well. Because the MSVC ABI doesn't have key functions, we end up referencing the vtable and implicit destructor on any virtual call through a class. We don't actually end up emitting the dtor, so it'd be good if we could avoid this unneeded type completion work.
  llvm-svn: 222480
* Update Makefile following directory removal in r222466 | Mehdi Amini | 2014-11-20 | 1 | -1/+1
  llvm-svn: 222475
* [Hexagon] [NFC] Merging InstPrinter directory in to MCTargetDesc since they have a circular dependency. | Colin LeMahieu | 2014-11-20 | 11 | -47/+6
  llvm-svn: 222458
* X86: use the correct alloca symbol for Windows Itanium | Saleem Abdulrasool | 2014-11-20 | 2 | -2/+8
  Windows itanium targets the MSVCRT, and the stack probe symbol is provided by MSVCRT. This corrects the emission of stack probes on i686-windows-itanium.
  llvm-svn: 222439
* [ELF] Prevent ARM ELF object writer from generating deprecated relocation code R_ARM_PLT32 | Jyoti Allur | 2014-11-20 | 1 | -1/+1
  llvm-svn: 222414
* Fix a typo in a comment. | Craig Topper | 2014-11-20 | 1 | -1/+1
  llvm-svn: 222412
* [Hexagon] Adding A2_xor instruction with IR selection pattern and test. | Colin LeMahieu | 2014-11-19 | 2 | -6/+6
  llvm-svn: 222399
* [Hexagon] Adding A2_or instruction with IR selection pattern and test. | Colin LeMahieu | 2014-11-19 | 2 | -3/+6
  llvm-svn: 222396
* [X86] Improved lowering of v4x32 build_vector dag nodes. | Andrea Di Biagio | 2014-11-19 | 1 | -58/+90
  This patch improves the lowering of v4f32 and v4i32 build_vector dag nodes that are known to have at least two non-zero elements.
  With this patch, a build_vector that performs a blend with zero is converted into a shuffle. This is done to let the shuffle legalizer expand the dag node in an optimal way. For example, if we know that a build_vector performs a blend with zero, we can try to lower it as a movq/blend instead of always selecting an insertps.
  This patch also improves the logic that lowers a build_vector into an insertps with zero masking. See for example the extra test cases added to test sse41.ll.
  Differential Revision: http://reviews.llvm.org/D6311
  llvm-svn: 222375
* R600/SI: Make SIInstrInfo::isOperandLegal() more strict | Tom Stellard | 2014-11-19 | 1 | -1/+10
  A register operand that has a common sub-class with its instruction's defined register class is not always legal. For example, SReg_32 and M0Reg both have a common sub-class, but we can't use an SReg_32 in instructions that expect a M0Reg.
  This prevents the llvm.SI.sendmsg.ll test from failing when the fold operand pass is added.
  llvm-svn: 222368
* [mips][micromips] Implement SWM32 and LWM32 instructions | Zoran Jovanovic | 2014-11-19 | 10 | -6/+288
  Differential Revision: http://reviews.llvm.org/D5519
  llvm-svn: 222367
* [mips][microMIPS] Fix opcodes of MFHC1 and MTHC1 instructions. | Jozef Kolek | 2014-11-19 | 1 | -4/+4
  Differential Revision: http://reviews.llvm.org/D6169
  llvm-svn: 222355
* [mips][microMIPS] Implement CodeGen support for 16-bit instruction ADDIUR2. | Jozef Kolek | 2014-11-19 | 1 | -0/+6
  Differential Revision: http://reviews.llvm.org/D5800
  llvm-svn: 222352
* [mips][microMIPS] Implement CodeGen support for ADDIUS5 instruction. | Jozef Kolek | 2014-11-19 | 2 | -3/+10
  Differential Revision: http://reviews.llvm.org/D5799
  llvm-svn: 222351
* [mips][microMIPS] Implement LWXS instruction. | Jozef Kolek | 2014-11-19 | 2 | -0/+23
  Differential Revision: http://reviews.llvm.org/D5407
  llvm-svn: 222348
* [mips][microMIPS] Implement SDBBP and RDHWR instructions. | Jozef Kolek | 2014-11-19 | 4 | -4/+31
  Differential Revision: http://reviews.llvm.org/D5240
  llvm-svn: 222347
* [X86][SSE] pslldq/psrldq byte shifts/rotation for SSE2 | Simon Pilgrim | 2014-11-19 | 1 | -31/+158
  This patch builds on http://reviews.llvm.org/D5598 to perform byte rotation shuffles (lowerVectorShuffleAsByteRotate) on pre-SSSE3 (palignr) targets - pre-SSSE3 is only enabled on i8 and i16 vector targets where it is a more definite performance gain.
  I've also added a separate byte shift shuffle (lowerVectorShuffleAsByteShift) that makes use of the ability of the SLLDQ/SRLDQ instructions to implicitly shift in zero bytes to avoid the need to create a zero register if we had used palignr.
  Differential Revision: http://reviews.llvm.org/D5699
  llvm-svn: 222340
* Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool> | David Blaikie | 2014-11-19 | 5 | -8/+9
  This is to be consistent with StringSet and ultimately with the standard library's associative container insert function.
  This led to updating SmallSet::insert to return pair<iterator, bool>, and then to update SmallPtrSet::insert to return pair<iterator, bool>, and then to update all the existing users of those functions...
  llvm-svn: 222334
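  The insert convention named in the entry above (return where the element lives, plus whether it was newly added) can be illustrated with a small sketch. It is written in Python purely for illustration; `OrderedSet` is a hypothetical class, not SetVector, and it returns a (position, inserted) tuple where the C++ containers return pair<iterator, bool>.

```python
class OrderedSet:
    """Insertion-ordered set whose insert reports whether it added anything."""

    def __init__(self):
        self._index = {}   # value -> position in self._items
        self._items = []   # values in insertion order

    def insert(self, value):
        """Return (position, inserted), mirroring pair<iterator, bool>."""
        if value in self._index:
            # Already present: report the existing position, nothing inserted.
            return self._index[value], False
        self._index[value] = len(self._items)
        self._items.append(value)
        return self._index[value], True

    def __iter__(self):
        return iter(self._items)
```

  A caller can then branch on the second element instead of doing a separate membership test before inserting.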
* [AArch64] Disable useAA for Cortex-A57. | Hao Liu | 2014-11-19 | 1 | -1/+1
  Using AA during CodeGen is very useful for in-order cores. It is less useful for ooo cores. Also I find enabling useAA for Cortex-A57 may generate worse code for some test cases. If useAA in codegen is improved and beneficial for ooo cores, we can enable it again.
  llvm-svn: 222333
* [AArch64] Enable SeparateConstOffsetFromGEP, EarlyCSE and LICM passes on AArch64 backend. | Hao Liu | 2014-11-19 | 1 | -0/+18
  SeparateConstOffsetFromGEP can give more optimization opportunities related to GEPs, which benefits EarlyCSE and LICM. By enabling these passes we can have better address calculations and generate a better addressing mode. Some SPEC 2006 benchmarks (astar, gobmk, namd) have obvious improvements on Cortex-A57.
  Reviewed in http://reviews.llvm.org/D5864.
  llvm-svn: 222331
* Remove StringMap::GetOrCreateValue in favor of StringMap::insert | David Blaikie | 2014-11-19 | 2 | -2/+2
  Having two ways to do this doesn't seem terribly helpful and consistently using the insert version (which we already have) seems like it'll make the code easier to understand to anyone working with standard data structures. (I also updated many references to the Entry's key and value to use first() and second instead of getKey{Data,Length,} and get/setValue - for similar consistency)
  Also removes the GetOrCreateValue functions so there's less surface area to StringMap to fix/improve/change/accommodate move semantics, etc.
  llvm-svn: 222319
* [Aarch64] Custom lowering of CTPOP to SIMD should check for NEON availability | Weiming Zhao | 2014-11-19 | 1 | -0/+3
  llvm-svn: 222292
* R600/SI: Implement areMemAccessesTriviallyDisjoint | Matt Arsenault | 2014-11-19 | 2 | -0/+91
  This partially makes up for not having address spaces used for alias analysis in some simple cases.
  This is not yet enabled by default so shouldn't change anything yet.
  llvm-svn: 222286
* R600/SI: Set hasSideEffects = 0 on load and store instructions. | Matt Arsenault | 2014-11-18 | 2 | -4/+9
  Assuming unmodeled side effects interferes with some scheduling opportunities.
  Don't put it in the base class of DS instructions since there are a few weird effecting, non load/store instructions there.
  llvm-svn: 222285
* [X86][AVX] 256-bit vector stack unaligned load/stores identification | Simon Pilgrim | 2014-11-18 | 1 | -0/+6
  Under many circumstances the stack is not 32-byte aligned, resulting in the use of the vmovups/vmovupd/vmovdqu instructions when inserting ymm reloads/spills.
  This minor patch adds these instructions to the isFrameLoadOpcode/isFrameStoreOpcode helpers so that they can be correctly identified and not be treated as folded reloads/spills.
  This has also been noticed by http://llvm.org/bugs/show_bug.cgi?id=18846 where it was causing redundant spills - I've added a reduced test case at test/CodeGen/X86/pr18846.ll
  Differential Revision: http://reviews.llvm.org/D6252
  llvm-svn: 222281
* [Hexagon] Adding A2_and instruction. | Colin LeMahieu | 2014-11-18 | 2 | -4/+7
  llvm-svn: 222274
* [FastISel][AArch64] Also allow folding of sign-/zero-extend and arithmetic shift-right for booleans (i1). | Chad Rosier | 2014-11-18 | 1 | -2/+3
  Arithmetic shift-right immediate with sign-/zero-extensions also works for boolean values. Update the assert and the test cases to reflect that fact.
  llvm-svn: 222272
* [FastISel][AArch64] Also allow folding of sign-/zero-extend and logical shift-right for booleans (i1). | Chad Rosier | 2014-11-18 | 1 | -2/+3
  Logical shift-right immediate with sign-/zero-extensions also works for boolean values. Update the assert and the test cases to reflect that fact.
  llvm-svn: 222270
* [Hexagon] Adding A2_sub instruction | Colin LeMahieu | 2014-11-18 | 1 | -0/+2
  Renaming test files.
  llvm-svn: 222263
* [FastISel][AArch64] Follow-up fix for "Fix shift-immediate emission for "zero" shifts." | Juergen Ributzka | 2014-11-18 | 1 | -17/+26
  Shifts also perform sign-/zero-extends to larger types, which requires us to emit an integer extend instead of a simple COPY.
  Related to PR21594.
  llvm-svn: 222257
* R600/SI: Move SIFixSGPRCopies to inst selector passes | Matt Arsenault | 2014-11-18 | 1 | -7/+9
  This should expose more of the actually used VALU instructions to the machine optimization passes. This also should help getting i1 handling into a better state.
  For reasons not entirely understood, this fixes the split-scalar-i64-add.ll test where a 64-bit add would only partially be moved to the VALU resulting in use of undefined VCC.
  llvm-svn: 222256
* [AArch64] Don't optimize all compare instructions. | Juergen Ributzka | 2014-11-18 | 1 | -26/+51
  "optimizeCompareInstr" converts compares (cmp/cmn) into plain sub/add instructions when the flags are not used anymore. This conversion is valid for most instructions, but not all. Some instructions that don't set the flags (e.g. sub with immediate) can set the SP, whereas the flag setting version uses the same encoding for the "zero" register.
  Update the code to also check for the return register before performing the optimization to make sure that a cmp doesn't suddenly turn into a sub that sets the stack pointer.
  I don't have a test case for this, because it isn't easy to trigger.
  llvm-svn: 222255
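  The check described in the entry above can be sketched as a small predicate. This is an illustrative model only, not the optimizeCompareInstr code: the opcode names and the `SP_WRITING_FORMS` set are hypothetical, and the sketch assumes that the immediate and extended-register forms are the ones where destination register 31 means SP once the flag-setting bit is dropped.

```python
# Register number 31 is the hazard: flag-setting forms treat it as the zero
# register, while some non-flag-setting forms treat it as the stack pointer.
REG31 = 31

# Hypothetical opcode names for the forms where a non-flag-setting
# destination of 31 would be SP (an assumption made for this sketch).
SP_WRITING_FORMS = {"SUBS_imm", "ADDS_imm", "SUBS_ext", "ADDS_ext"}

def can_drop_flag_setting(opcode, dest_reg, flags_used):
    """Return True if a flag-setting compare may become a plain sub/add."""
    if flags_used:
        # Something still reads the flags, so the compare must stay.
        return False
    if opcode in SP_WRITING_FORMS and dest_reg == REG31:
        # The result was discarded into the zero register; the plain form
        # would write the stack pointer instead.
        return False
    return True
```

  The second check is the one the commit message argues for: dead flags alone do not make the conversion safe.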
* R600/SI: Make sure resource descriptors are always stored in SGPRs | Tom Stellard | 2014-11-18 | 1 | -2/+2
  llvm-svn: 222253
* [Hexagon] Converting from ADD_rr to A2_add which has encoding bits. | Colin LeMahieu | 2014-11-18 | 7 | -15/+128
  Adding test to show correct instruction selection and encoding.
  llvm-svn: 222249
* [FastISel][AArch64] Fix shift-immediate emission for "zero" shifts. | Juergen Ributzka | 2014-11-18 | 1 | -6/+33
  This change emits a COPY for a shift-immediate with a "zero" shift value. This fixes PR21594 where we emitted a shift instruction with an incorrect immediate operand.
  llvm-svn: 222247
* Test commit to verify that commit access works. | Jozef Kolek | 2014-11-18 | 1 | -1/+1
  llvm-svn: 222244
* Revert "ADT: correctly report isMSVCEnvironment for windows itanium" | Reid Kleckner | 2014-11-17 | 2 | -2/+2
  This reverts commit r222180.
  llvm-svn: 222188
* ADT: correctly report isMSVCEnvironment for windows itanium | Saleem Abdulrasool | 2014-11-17 | 2 | -2/+2
  The itanium environment on Windows uses MSVC and is an MSVC environment. Report this correctly.
  llvm-svn: 222180
* R600/SI: Don't copy flags when extracting subreg | Matt Arsenault | 2014-11-17 | 1 | -6/+8
  This was resulting in use of a register after a kill.
  For some reason this showed up as a problem in many tests when moving the SIFixSGPRCopies pass closer to instruction selection.
  llvm-svn: 222175
* R600/SI: Assume SIFixSGPRCopies makes changes | Matt Arsenault | 2014-11-17 | 1 | -1/+2
  I'm not sure if this was breaking anything.
  llvm-svn: 222174
* [X86] Use ADD/SUB instead of INC/DEC for Haswell and Broadwell CPUs | Alexey Volkov | 2014-11-17 | 1 | -2/+3
  Differential Revision: http://reviews.llvm.org/D5934
  llvm-svn: 222141
* [Thumb1] Re-write emitThumbRegPlusImmediate | Oliver Stannard | 2014-11-17 | 1 | -136/+134
  This was motivated by a bug which caused code like this to be miscompiled:

    declare void @take_ptr(i8*)
    define void @test() {
      %addr1.32 = alloca i8
      %addr2.32 = alloca i32, i32 1028
      call void @take_ptr(i8* %addr1)
      ret void
    }

  This was emitting the following assembly to get the value of %addr1:

    add r0, sp, #1020
    add r0, r0, #8

  However, "add r0, r0, #8" is not a valid Thumb1 instruction, and this could not be assembled. The generated object file contained this, resulting in r0 holding SP+8 rather than SP+1028:

    add r0, sp, #1020
    add r0, sp, #8

  This function looked like it could have caused miscompilations for other combinations of registers and offsets (though I don't think it is currently called with these), and the heuristic it used did not match the emitted code in all cases.
  llvm-svn: 222125
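  One way to split an SP-relative offset into encodable steps is sketched below. It is not the rewritten emitThumbRegPlusImmediate; the function name `sp_plus_imm` is hypothetical, and the sketch relies on only two Thumb1 encodings: "add Rd, sp, #imm" with imm a multiple of 4 in 0..1020, and the flag-setting "adds Rdn, #imm" with imm in 0..255.

```python
def sp_plus_imm(rd, offset):
    """Return Thumb1-style instructions that leave SP + offset in rd.

    Sketch only: handles non-negative offsets and a low destination register.
    """
    if offset < 0:
        raise ValueError("negative offsets are not handled in this sketch")

    # First step: SP-relative add, limited to a multiple of 4 up to 1020.
    first = min(offset, 1020) & ~3
    insts = [f"add {rd}, sp, #{first}"]

    # Remaining bytes: accumulate into rd with 8-bit immediates.
    remaining = offset - first
    while remaining > 0:
        chunk = min(remaining, 255)
        insts.append(f"adds {rd}, #{chunk}")
        remaining -= chunk
    return insts
```

  For the offset from the commit message, sp_plus_imm("r0", 1028) yields "add r0, sp, #1020" followed by "adds r0, #8". Because the second form sets the condition flags, real code generation would also have to check that the flags are not live at that point.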
* Convert some EVTs to MVTs where only a SimpleValueType is needed. | Craig Topper | 2014-11-16 | 1 | -1/+1
  llvm-svn: 222109
* [x86] Remove two redundant isel patterns. The equivalent already exists in the instruction pattern. | Craig Topper | 2014-11-16 | 1 | -5/+0
  llvm-svn: 222094