summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
* [SystemZ] Be more careful about inverting CC masks (conditional loads)Richard Sandiford2013-07-315-31/+31
| | | | | | | | Extend r187495 to conditional loads. I split this out because the easiest way seemed to be to force a particular operand order in SystemZISelDAGToDAG.cpp. llvm-svn: 187496
* [SystemZ] Be more careful about inverting CC masksRichard Sandiford2013-07-319-81/+137
| | | | | | | | | | | | | | | | | | | | | | | | System z branches have a mask to select which of the 4 CC values should cause the branch to be taken. We can invert a branch by inverting the mask. However, not all instructions can produce all 4 CC values, so inverting the branch like this can lead to some oddities. For example, integer comparisons only produce a CC of 0 (equal), 1 (less) or 2 (greater). If an integer EQ is reversed to NE before instruction selection, the branch will test for 1 or 2. If instead the branch is reversed after instruction selection (by inverting the mask), it will test for 1, 2 or 3. Both are correct, but the second isn't really canonical. This patch therefore keeps track of which CC values are possible and uses this when inverting a mask. Although this is mostly cosmestic, it fixes undefined behavior for the CIJNLH in branch-08.ll. Another fix would have been to mask out bit 0 when generating the fused compare and branch, but the point of this patch is that we shouldn't need to do that in the first place. The patch also makes it easier to reuse CC results from other instructions. llvm-svn: 187495
* [SystemZ] Move compare-and-branch generation even laterRichard Sandiford2013-07-314-136/+119
| | | | | | | | | | | | | | | | | | | | | | | r187116 moved compare-and-branch generation from the instruction-selection pass to the peephole optimizer (via optimizeCompare). It turns out that even this is a bit too early. Fused compare-and-branch instructions don't interact well with predication, where a CC result is needed. They also make it harder to reuse the CC side-effects of earlier instructions (not yet implemented, but the subject of a later patch). Another problem was that the AnalyzeBranch family of routines weren't handling compares and branches, so we weren't able to reverse the fused form in cases where we would reverse a separate branch. This could have been fixed by extending AnalyzeBranch, but given the other problems, I've instead moved the fusing to the long-branch pass, which is also responsible for the opposite transformation: splitting out-of-range compares and branches into separate compares and long branches. I've added a test for the AnalyzeBranch problem. A test for the predication problem is included in the next patch, which fixes a bug in the choice of CC mask. llvm-svn: 187494
* Fixed assertion in Extract128BitVector()Elena Demikhovsky2013-07-311-1/+2
| | | | llvm-svn: 187493
* [SystemZ] Postpone NI->RISBG conversion to convertToThreeAddress()Richard Sandiford2013-07-314-79/+193
| | | | | | | | | | | | | | | | | | | | | | r186399 aggressively used the RISBG instruction for immediate ANDs, both because it can handle some values that AND IMMEDIATE can't, and because it allows the destination register to be different from the source. I realized later while implementing the distinct-ops support that it would be better to leave the choice up to convertToThreeAddress() instead. The AND IMMEDIATE form is shorter and is less likely to be cracked. This is a problem for 32-bit ANDs because we assume that all 32-bit operations will leave the high word untouched, whereas RISBG used in this way will either clear the high word or copy it from the source register. The patch uses the z196 instruction RISBLG for this instead. This means that z10 will be restricted to NILL, NILH and NILF for 32-bit ANDs, but I think that should be OK for now. Although we're using z10 as the base architecture, the optimization work is going to be focused more on z196 and zEC12. llvm-svn: 187492
* Added INSERT and EXTRACT intructions from AVX-512 ISA.Elena Demikhovsky2013-07-316-181/+811
| | | | | | | | | All insertf*/extractf* functions replaced with insert/extract since we have insertf and inserti forms. Added lowering for INSERT_VECTOR_ELT / EXTRACT_VECTOR_ELT for 512-bit vectors. Added lowering for EXTRACT/INSERT subvector for 512-bit vectors. Added a test. llvm-svn: 187491
* [SystemZ] Add RISBLG and RISBHG instruction definitionsRichard Sandiford2013-07-314-3/+19
| | | | | | The next patch will make use of RISBLG for codegen. llvm-svn: 187490
* Increment arg_count inside the loop in printInline. Patch by Joe Matarazzo.Craig Topper2013-07-311-1/+1
| | | | llvm-svn: 187477
* Changed register names (and pointer keywords) to be lower case when using ↵Craig Topper2013-07-319-230/+225
| | | | | | | | Intel X86 assembler syntax. Patch by Richard Mitton. llvm-svn: 187476
* Remove trailing whitespace and some tab characters.Craig Topper2013-07-311-9/+9
| | | | llvm-svn: 187472
* Fixed incorrect disassembly for MOV16o16a when using Intel syntax.Craig Topper2013-07-311-2/+2
| | | | | | Patch by Richard Mitton. llvm-svn: 187471
* [mips] Rename instruction DANDi to ANDi64.Akira Hatanaka2013-07-311-4/+4
| | | | | | No functionality change. llvm-svn: 187469
* [mips] Define instruction itineraries IIArith and IILogic.Akira Hatanaka2013-07-315-38/+49
| | | | | | No functionality change. llvm-svn: 187468
* [mips] Delete instruction format for "bal".Akira Hatanaka2013-07-301-11/+0
| | | | llvm-svn: 187443
* [mips] Define "bal" as a pseudo instruction. Also, fix bug in the InstAlias thatAkira Hatanaka2013-07-302-5/+9
| | | | | | turns "bal" into "bgezal". llvm-svn: 187440
* [Sparc] Rewrite MBB's live-in registers for leaf functions. Also, addVenkatraman Govindaraju2013-07-302-7/+20
| | | | | | | | register i7 as a live-in if current function's return address is taken. This revision fixes PR16269. llvm-svn: 187433
* R600/SI: Expand vector fp <-> int conversionsTom Stellard2013-07-302-4/+4
| | | | llvm-svn: 187421
* This patch implements parsing of mips FCC register operands. The example ↵Vladimir Medic2013-07-304-14/+66
| | | | | | instructions have been added to test files. llvm-svn: 187410
* [ARM] check bitwidth in PerformORCombineSaleem Abdulrasool2013-07-301-14/+21
| | | | | | | | | | | | | When simplifying a (or (and B A) (and C ~A)) to a (VBSL A B C) ensure that the bitwidth of the second operands to both ands match before comparing the negation of the values. Split the check of the value of the second operands to the ands. Move the cast and variable declaration slightly higher to make it slightly easier to follow. Bug-Id: 16700 Signed-off-by: Saleem Abdulrasool <compnerd@compnerd.org> llvm-svn: 187404
* [Sparc] Use call's debugloc for the unimp instruction.Venkatraman Govindaraju2013-07-301-1/+1
| | | | llvm-svn: 187402
* [PowerPC] Skeletal FastISel support for 64-bit PowerPC ELF.Bill Schmidt2013-07-305-1/+347
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the first of many upcoming patches for PowerPC fast instruction selection support. This patch implements the minimum necessary for a functional (but extremely limited) FastISel pass. It allows the table-generated portions of the selector to be created and used, but in most cases selection will fall back to the DAG selector. None of the block terminator instructions are implemented yet, and most interesting instructions require some special handling. Therefore there aren't any new test cases with this patch. There will be quite a few tests coming with future patches. This patch adds the make/CMake support for the new code (including tablegen -gen-fast-isel) and creates the FastISel object for PPC64 ELF only. It instantiates the necessary virtual functions (TargetSelectInstruction, TargetMaterializeConstant, TargetMaterializeAlloca, tryToFoldLoadIntoMI, and FastLowerArguments), but of these, only TargetMaterializeConstant contains any useful implementation. This is present since the table-generated code requires the ability to materialize integer constants for some instructions. This patch has been tested by building and running the projects/test-suite code with -O0. All tests passed with the exception of a couple of long-running tests that time out using -O0 code generation. llvm-svn: 187399
* [R600] Replicate old DAGCombiner behavior in target specific DAG combine.Quentin Colombet2013-07-301-0/+56
| | | | | | | build_vector is lowered to REG_SEQUENCE, which is something the register allocator does a good job at optimizing. llvm-svn: 187397
* [mips] Add comment and simplify function.Akira Hatanaka2013-07-291-23/+14
| | | | llvm-svn: 187371
* Use proper section suffix for COFF weak symbolsNico Rieck2013-07-291-12/+17
| | | | | | | | | 32-bit symbols have "_" as global prefix, but when forming the name of COMDAT sections this prefix is ignored. The current behavior assumes that this prefix is always present which is not the case for 64-bit and names are truncated. llvm-svn: 187356
* Proper va_arg/va_copy lowering on win64Nico Rieck2013-07-291-1/+3
| | | | | | | Win64 uses CharPtrBuiltinVaList instead of X86_64ABIBuiltinVaList like other 64-bit targets. llvm-svn: 187355
* Allow generation of vmla.f32 instructions when targeting Cortex-A15. The ↵Silviu Baranga2013-07-294-4/+6
| | | | | | patch also adds the VFP4 feature to Cortex-A15 and fixes the DontUseFusedMAC predicate so that we can still generate vmla.f32 instructions on non-darwin targets with VFP4. llvm-svn: 187349
* test commitRobert Lytton2013-07-291-0/+1
| | | | llvm-svn: 187348
* Added encoding prefixes for KNL instructions (EVEX).Elena Demikhovsky2013-07-2813-22/+441
| | | | | | | Added 512-bit operands printing. Added instruction formats for KNL instructions. llvm-svn: 187324
* [PowerPC] Add comment explaining preprocessor directive.Bill Schmidt2013-07-281-0/+2
| | | | llvm-svn: 187320
* Revert 187318Bill Schmidt2013-07-281-1/+1
| | | | llvm-svn: 187319
* [PowerPC] Remove unnecessary preprocessor checking.Bill Schmidt2013-07-281-1/+1
| | | | | | | | | The tests !defined(__ppc__) && !defined(__powerpc__) are not needed or helpful when verifying that code is being compiled for a 64-bit target. The simpler test provided by this revision is sufficient to tell if the target is 64-bit. llvm-svn: 187318
* Create a constant pool symbol for the GOT in the ARMCGBR the same way weChandler Carruth2013-07-271-7/+8
| | | | | | | | | | | | | | do in the SDag when lowering references to the GOT: use ARMConstantPoolSymbol rather than creating a dummy global variable. The computation of the alignment still feels weird (it uses IR types and datalayout) but it preserves the exact previous behavior. This change fixes the memory leak of the global variable detected on the valgrind leak checking bot. Thanks to Benjamin Kramer for pointing me at ARMConstantPoolSymbol to handle this use case. llvm-svn: 187303
* Fix yet another memory leak found by the vg-leak bot. Folks (includingChandler Carruth2013-07-271-2/+6
| | | | | | | | | | | | | | | | me) should start watching this bot more as its catching lots of bugs. The fix here is to not construct the global if we aren't going to need it. That's cheaper anyways, and globals have highly predictable types in practice. I've added an assert to catch skew between our manual testing of the type and the actual type just for paranoia's sake. Note that this pattern is actually fine in most globals because when you build a global with a module it automatically is moved to be owned by that module. But here, we're in isel and don't really want to do that. The solution of not creating a global is simpler anyways. llvm-svn: 187302
* Fix a memory leak in the hexagon scheduler. We call initialize here moreChandler Carruth2013-07-271-0/+2
| | | | | | | | | | | | | than once, and the second time through we leaked memory. Found thanks to the vg-leak bot, but I can't locally reproduce it with valgrind. The debugger confirms that it is in fact leaking here. This whole code is totally gross. Why is initialize being called on each runOnFunction??? Why aren't these OwningPtr<>s, and why aren't their lifetimes better defined? Anyways, this is just a surgical change to help out the leak checking bots. llvm-svn: 187299
* SimplifyCFG: Use parallel-and and parallel-or mode to consolidate branch ↵Tom Stellard2013-07-275-0/+110
| | | | | | | | | | | | | | conditions Merge consecutive if-regions if they contain identical statements. Both transformations reduce number of branches. The transformation is guarded by a target-hook, and is currently enabled only for +R600, but the correctness has been tested on X86 target using a variety of CPU benchmarks. Patch by: Mei Ye llvm-svn: 187278
* Revert "[PowerPC] Improve consistency in use of __ppc__, __powerpc__, etc."Rafael Espindola2013-07-261-3/+3
| | | | | | This reverts commit r187248. It broke many bots. llvm-svn: 187254
* [PowerPC] Improve consistency in use of __ppc__, __powerpc__, etc.Bill Schmidt2013-07-261-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | Both GCC and LLVM will implicitly define __ppc__ and __powerpc__ for all PowerPC targets, whether 32- or 64-bit. They will both implicitly define __ppc64__ and __powerpc64__ for 64-bit PowerPC targets, and not for 32-bit targets. We cannot be sure that all other possible compilers used to compile Clang/LLVM define both __ppc__ and __powerpc__, for example, so it is best to check for both when relying on either inside the Clang/LLVM code base. This patch makes sure we always check for both variants. In addition, it fixes one unnecessary check in lib/Target/PowerPC/PPCJITInfo.cpp. (At least one of __ppc__ and __powerpc__ should always be defined when compiling for a PowerPC target, no matter which compiler is used, so testing for them is unnecessary.) There are some places in the compiler that check for other variants, like __POWERPC__ and _POWER, and I have left those in place. There is no need to add them elsewhere. This seems to be in Apple-specific code, and I won't take a chance on breaking it. There is no intended change in behavior; thus, no test cases are added. llvm-svn: 187248
* [mips] Implement llvm.trap intrinsic.Akira Hatanaka2013-07-262-0/+7
| | | | | | Patch by Sasa Stankovic. llvm-svn: 187244
* [mips] Fix FP conditional move instructions to have explicit FP condition codeAkira Hatanaka2013-07-264-13/+14
| | | | | | register operands. llvm-svn: 187242
* [mips] Fix FP branch instructions to have explicit FP condition code registerAkira Hatanaka2013-07-265-25/+41
| | | | | | operands. llvm-svn: 187238
* [mips] Increase the number of floating point condition code registers to eight.Akira Hatanaka2013-07-261-3/+5
| | | | llvm-svn: 187234
* [mips] Fix floating point branch, comparison, and conditional move instructionsAkira Hatanaka2013-07-262-4/+4
| | | | | | | | | to have register FCC0 (the first floating point condition code register) in their Uses/Defs list. No intended functionality change. llvm-svn: 187233
* [mips] Delete register print method MipsInstPrinter::printCPURegs that is notAkira Hatanaka2013-07-263-11/+5
| | | | | | | | needed. The generic method printOperand will do. No functionality change. llvm-svn: 187231
* [mips] Print instructions "beq", "bne" and "or" using assembler pseudoAkira Hatanaka2013-07-262-1/+57
| | | | | | | | | | instructions "beqz", "bnez" and "move", when possible. beq $2, $zero, $L1 => beqz $2, $L1 bne $2, $zero, $L1 => bnez $2, $L1 or $2, $3, $zero => move $2, $3 llvm-svn: 187229
* Add a target legalize hook for SplitVectorOperand (again)Justin Holewinski2013-07-261-1/+1
| | | | | | | | | | | | | | CustomLowerNode was not being called during SplitVectorOperand, meaning custom legalization could not be used by targets. This also adds a test case for NVPTX that depends on this custom legalization. Differential Revision: http://llvm-reviews.chandlerc.com/D1195 Attempt to fix the buildbots by making the X86 test I just added platform independent llvm-svn: 187202
* Revert "Add a target legalize hook for SplitVectorOperand"Rafael Espindola2013-07-261-1/+1
| | | | | | | | | | This reverts commit 187198. It broke the bots. The soft float test probably needs a -triple because of name differences. On the hard float test I am getting a "roundss $1, %xmm0, %xmm0", instead of "vroundss $1, %xmm0, %xmm0, %xmm0". llvm-svn: 187201
* Add a target legalize hook for SplitVectorOperandJustin Holewinski2013-07-261-1/+1
| | | | | | | | | | | | CustomLowerNode was not being called during SplitVectorOperand, meaning custom legalization could not be used by targets. This also adds a test case for NVPTX that depends on this custom legalization. Differential Revision: http://llvm-reviews.chandlerc.com/D1195 llvm-svn: 187198
* test commitRichard Osborne2013-07-261-1/+1
| | | | llvm-svn: 187195
* [XCore] Add TODO regarding byval structsRichard Osborne2013-07-261-0/+2
| | | | llvm-svn: 187193
* Fix more Intel syntax issues with FP instruction aliases. Test cases coming ↵Craig Topper2013-07-261-8/+8
| | | | | | in a subsequent patch. llvm-svn: 187187
OpenPOWER on IntegriCloud