Commit message | Author | Age | Files | Lines
...
* [PowerPC] Use true offset value in "memrix" machine operands | Ulrich Weigand | 2013-05-16 | 12 | -194/+61
    This is the second part of the change to always return "true" offset values from getPreIndexedAddressParts, tackling the case of "memrix" type operands.

    This is about instructions like LD/STD that only have a 14-bit field to encode immediate offsets, which are implicitly extended by two zero bits by the machine, so that in effect we can access 16-bit offsets as long as they are a multiple of 4.

    The PowerPC back end currently handles such instructions by carrying the 14-bit value (as it will get encoded into the actual machine instructions) in the machine operand fields for such instructions. This means that those values are in fact not the true offset, but rather the offset divided by 4 (and then truncated to an unsigned 14-bit value).

    Like in the case fixed in r182012, this makes common code operations on such offset values not work as expected. Furthermore, there doesn't really appear to be any strong reason why we should encode machine operands this way.

    This patch therefore changes the encoding of "memrix" type machine operands to simply contain the "true" offset value as a signed immediate value, while enforcing the rules that it must fit in a 16-bit signed value and must also be a multiple of 4.

    This change must be made simultaneously in all places that access machine operands of this type. However, just about all those changes make the code simpler; in many cases we can now just share the same code for memri and memrix operands.

    llvm-svn: 182032
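The "memrix" rule described in r182032 can be sketched in a few lines of standalone C++. This is an illustration only, not code from the patch: the helper names isLegalMemrixOffset, encodeDSField and decodeDSField are hypothetical, and the field layout (true offset with the two low zero bits dropped, kept as 14 bits) is assumed from the commit message's description.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): a "true" offset is usable by a
// memrix-style instruction if it fits in a signed 16-bit value and is a
// multiple of 4.
constexpr bool isLegalMemrixOffset(int64_t Offset) {
  return Offset >= -32768 && Offset <= 32767 && (Offset % 4) == 0;
}

// Hypothetical helper: derive the 14-bit field that would be placed in the
// instruction word. Returns false if the offset is not encodable.
inline bool encodeDSField(int64_t Offset, uint16_t &Field) {
  if (!isLegalMemrixOffset(Offset))
    return false;
  // Drop the two implicit zero bits and keep the low 14 bits.
  Field = static_cast<uint16_t>((static_cast<uint64_t>(Offset) >> 2) & 0x3FFF);
  return true;
}

// Hypothetical inverse: re-append the two zero bits and sign-extend from
// 16 bits to recover the true offset.
inline int64_t decodeDSField(uint16_t Field) {
  assert(Field <= 0x3FFF && "field is only 14 bits wide");
  return static_cast<int16_t>(static_cast<uint16_t>(Field << 2));
}
```

Under this model an offset of -4 is carried as the field value 0x3FFF. Common code that sees only the field cannot tell that it means -4, which is the situation the patch avoids by keeping the true signed offset in the machine operand and deriving the field only at encoding time.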
* PPC32 cannot form counter loops around i64 FP conversions | Hal Finkel | 2013-05-16 | 2 | -1/+33
    On PPC32, i64 FP conversions are implemented using runtime calls (which clobber the counter register). These must be excluded.
    llvm-svn: 182023
* Add a triple to the test to try to fix the windows bots. | Rafael Espindola | 2013-05-16 | 1 | -1/+1
    llvm-svn: 182022
* More addFrameMove test coverage. | Rafael Espindola | 2013-05-16 | 1 | -1/+9
    llvm-svn: 182021
* Use new CHECK-DAG support to stabilize CodeGen/PowerPC/recipest.ll | Bill Schmidt | 2013-05-16 | 1 | -16/+16
    While testing some experimental code to add vector-scalar registers to PowerPC, I noticed that a couple of independent instructions were flipped by the scheduler. The new CHECK-DAG support is perfect for avoiding this problem.
    llvm-svn: 182020
* Add more addFrameMove test coverage. | Rafael Espindola | 2013-05-16 | 1 | -0/+6
    llvm-svn: 182019
* Fixing a 64-bit conversion warning in MSVC. | Aaron Ballman | 2013-05-16 | 1 | -1/+1
    llvm-svn: 182018
* Add more test coverage for addFrameMove. | Rafael Espindola | 2013-05-16 | 1 | -0/+5
    llvm-svn: 182017
* Remove dead calls to addFrameMove. | Rafael Espindola | 2013-05-16 | 1 | -25/+0
    Without a PROLOG_LABEL present, the cfi instructions are never printed.
    llvm-svn: 182016
* [PowerPC] Report true displacement value from getPreIndexedAddressParts | Ulrich Weigand | 2013-05-16 | 2 | -2/+21
    DAGCombiner::CombineToPreIndexedLoadStore calls a target routine to decompose a memory address into a base/offset pair. It expects the offset (if constant) to be the true displacement value in order to perform optional additional optimizations; in particular, to convert other uses of the original pointer into uses of the new base pointer after pre-increment.

    The PowerPC implementation of getPreIndexedAddressParts, however, simply calls SelectAddressRegImm, which returns a TargetConstant. This value is appropriate for encoding into the instruction, but it is not always usable as true displacement value:

    - Its type is always MVT::i32, even on 64-bit, where addresses ought to be i64 ... this causes the optimization to simply always fail on 64-bit due to this line in DAGCombiner:

          // FIXME: In some cases, we can be smarter about this.
          if (Op1.getValueType() != Offset.getValueType()) {

    - Its value is truncated to an unsigned 16-bit value if negative. This causes the above optimization to generate wrong code.

    This patch fixes both problems by simply returning the true displacement value (in its original type). This doesn't affect any other user of the displacement.

    llvm-svn: 182012
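The second problem listed in r182012 (a negative displacement truncated to an unsigned 16-bit value) is easy to reproduce with plain integer arithmetic. The sketch below is not DAGCombiner or PowerPC back-end code; encodedDisplacement and addDisplacement are hypothetical helpers that only model the arithmetic the commit message describes.

```cpp
#include <cstdint>

// Hypothetical model of the instruction-encoding form: only the low 16 bits
// of the displacement survive, reinterpreted as an unsigned value.
constexpr uint32_t encodedDisplacement(int64_t TrueDisp) {
  return static_cast<uint32_t>(static_cast<uint64_t>(TrueDisp) & 0xFFFF);
}

// Hypothetical model of address arithmetic done by common code that assumes
// the displacement it was handed is the true one.
constexpr uint64_t addDisplacement(uint64_t Base, int64_t Disp) {
  return Base + static_cast<uint64_t>(Disp);
}

// With the true displacement -8, the address moves back by 8 bytes.
static_assert(addDisplacement(0x1000, -8) == 0x0FF8, "true displacement");

// With the encoded form, -8 has become 0xFFF8, so the same arithmetic lands
// almost 64 KiB past the base instead of 8 bytes before it.
static_assert(encodedDisplacement(-8) == 0xFFF8, "truncated encoding");
static_assert(addDisplacement(0x1000, encodedDisplacement(-8)) == 0x10FF8,
              "wrong address when the encoded form is treated as true");
```

The encoded value is still the right bit pattern for the instruction word; it is only wrong as an input to further address computation, which is why the patch returns the true value from getPreIndexedAddressParts.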
* Add more addFrameMove test coverage. | Rafael Espindola | 2013-05-16 | 1 | -1/+6
    llvm-svn: 182011
* Extend test to check the .cfi instructions. | Rafael Espindola | 2013-05-16 | 1 | -1/+15
    I am about to refactor the calls to addFrameMove and some of the ppc ones were not being tested.
    llvm-svn: 182009
* [SystemZ] Tweak register array comment | Richard Sandiford | 2013-05-16 | 1 | -2/+5
    llvm-svn: 182007
* Relax CHECK-NEXTs a bit to cope with atom's return nop padding. | Benjamin Kramer | 2013-05-16 | 1 | -2/+2
    llvm-svn: 181999
* [msan] Switch TLS globals to initial-exec model. | Evgeniy Stepanov | 2013-05-16 | 2 | -8/+17
    They are always defined in the main executable.
    llvm-svn: 181994
* Removed unused variable, detected by gcc | Patrik Hagglund | 2013-05-16 | 1 | -2/+0
    -Wunused-but-set-variable. Leftover from r181979.
    llvm-svn: 181993
* Delete dead code. | Rafael Espindola | 2013-05-16 | 1 | -1/+2
    llvm-svn: 181982
* Don't call addFrameMove on XCore. | Rafael Espindola | 2013-05-16 | 1 | -34/+0
    getExceptionHandlingType is not ExceptionHandling::DwarfCFI on xcore, so getFrameInstructions is never called. There is no point creating cfi instructions if they are never used.
    llvm-svn: 181979
* Respect the 'nobuiltin' attribute when determining if a call is to a memory builtin. | Richard Smith | 2013-05-16 | 2 | -0/+21
    llvm-svn: 181978
* Extend test for better coverage. | Rafael Espindola | 2013-05-16 | 1 | -1/+6
    Without this change nothing was covering this addFrameMove:

        // For 64-bit SVR4 when we have spilled CRs, the spill location
        // is SP+8, not a frame-relative slot.
        if (Subtarget.isSVR4ABI() && Subtarget.isPPC64() &&
            (PPC::CR2 <= Reg && Reg <= PPC::CR4)) {
          MachineLocation CSDst(PPC::X1, 8);
          MachineLocation CSSrc(PPC::CR2);
          MMI.addFrameMove(Label, CSDst, CSSrc);
          continue;
        }

    llvm-svn: 181976
* Removed dead code. | Rafael Espindola | 2013-05-16 | 1 | -8/+4
    llvm-svn: 181975
* Fix PBQP graph iterator typedefs. | Lang Hames | 2013-05-16 | 1 | -4/+4
    llvm-svn: 181973
* Patch number 2 for mips16/32 floating point interoperability stubs. | Reed Kotler | 2013-05-16 | 2 | -3/+337
    This creates stubs that help Mips32 functions call Mips16 functions which have floating point parameters that are normally passed in floating point registers.
    llvm-svn: 181972
* Revert "Support unaligned load/store on more ARM targets" | Derek Schuff | 2013-05-15 | 1 | -17/+4
    This reverts r181898.
    llvm-svn: 181944
* Remove dead code. | Eli Bendersky | 2013-05-15 | 2 | -21/+0
    This method is not being used/tested anywhere.
    llvm-svn: 181943
* LoopVectorize: Move call of canHoistAllLoads to canVectorizeWithIfConvert | Arnold Schwaighofer | 2013-05-15 | 1 | -4/+4
    We only want to check this once, not for every conditional block in the loop.

    No functionality change (except that we don't perform a check redundantly anymore).
    llvm-svn: 181942
* Delete dead code. | Rafael Espindola | 2013-05-15 | 1 | -19/+9
    llvm-svn: 181941
* Set an explicit triple for this test. | David Majnemer | 2013-05-15 | 1 | -1/+1
    This allows the test to correctly check symbol names.
    llvm-svn: 181939
* undef setjmp in PPCCTRLoops | Hal Finkel | 2013-05-15 | 1 | -0/+16
    Trying to unbreak the VS build by copying some undef code from Utils/LowerInvoke.cpp.
    llvm-svn: 181938
* X86: Remove redundant test instructions | David Majnemer | 2013-05-15 | 2 | -7/+193
    Increase the number of instructions LLVM recognizes as setting the ZF flag. This allows us to remove test instructions that redundantly recalculate the flag.
    llvm-svn: 181937
* Use proper syntax. | Bill Wendling | 2013-05-15 | 1 | -1/+1
    llvm-svn: 181930
* Implement PPC counter loops as a late IR-level pass | Hal Finkel | 2013-05-15 | 15 | -674/+1834
    The old PPCCTRLoops pass, like the Hexagon pass version from which it was derived, could only handle some simple loops in canonical form. We cannot directly adapt the new Hexagon hardware loops pass, however, because the Hexagon pass contains a fundamental assumption that non-constant-trip-count loops will contain a guard, and this is not always true (the result being that incorrect negative counts can be generated). With this commit, we replace the pass with a late IR-level pass which makes use of SE to calculate the backedge-taken counts and safely generate the loop-count expressions (including any necessary max() parts). This IR level pass inserts custom intrinsics that are lowered into the desired decrement-and-branch instructions.

    The most fragile part of this new implementation is that interfering uses of the counter register must be detected on the IR level (and, on PPC, this also includes any indirect branches in addition to function calls). Also, to make all of this work, we need a variant of the mtctr instruction that is marked as having side effects. Without this, machine-code level CSE, DCE, etc. illegally transform the resulting code. Hopefully, this can be improved in the future.

    This new pass is smaller than the original (and much smaller than the new Hexagon hardware loops pass), and can handle many additional cases correctly. In addition, the preheader-creation code has been copied from LoopSimplify, and after we decide on where it belongs, this code will be refactored so that it can be explicitly shared (making this implementation even smaller).

    The new test-case files ctrloop-{le,lt,ne}.ll have been adapted from tests for the new Hexagon pass. There are a few classes of loops that this pass does not transform (noted by FIXMEs in the files), but these deficiencies can be addressed within the SE infrastructure (thus helping many other passes as well).

    llvm-svn: 181927
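The remark in r181927 about unguarded loops, negative counts and "necessary max() parts" can be illustrated with a small standalone model. This is not the PPCCTRLoops pass and does not use ScalarEvolution; it assumes one particular loop shape, `do { body } while (++i < n);`, and the function names naiveTripCount, safeTripCount and simulateTripCount are hypothetical. Overflow of Start + 1 is ignored for brevity.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical: the count you get by assuming a guard has already
// established Start < N. It goes negative when that assumption is false.
inline int64_t naiveTripCount(int64_t Start, int64_t N) {
  return N - Start;
}

// Hypothetical: the count with the max() part included. The body of a
// bottom-tested loop always runs at least once, even if Start >= N.
inline int64_t safeTripCount(int64_t Start, int64_t N) {
  return std::max(N, Start + 1) - Start;
}

// Reference: actually run the assumed loop shape and count the iterations.
inline int64_t simulateTripCount(int64_t Start, int64_t N) {
  int64_t I = Start;
  int64_t Count = 0;
  do {
    ++Count;
  } while (++I < N);
  return Count;
}
```

For Start = 5 and N = 3 the loop body runs once, safeTripCount returns 1, and naiveTripCount returns -2. A negative value loaded into a hardware loop counter is the kind of incorrect count the commit message warns about.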
* Fix legalization of SETCC with promoted integer intrinsics | Hal Finkel | 2013-05-15 | 1 | -2/+13
    If the input operands to SETCC are promoted, we need to make sure that we either use the promoted form of both operands (or neither); a mixture is not allowed. This can happen, for example, if a target has a custom promoted i1-returning intrinsic (where i1 is not a legal type). In this case, we need to use the promoted form of both operands.

    This change only augments the behavior of the existing logic in the case where the input types (which may or may not have already been legalized) disagree, and should not affect existing target code because this case would otherwise cause an assert in the SETCC operand promotion code.

    This will be covered by (essentially all of the) tests for the new PPCCTRLoops infrastructure.

    llvm-svn: 181926
* Add lldb and polly to the projects to tag. | Bill Wendling | 2013-05-15 | 1 | -2/+3
    llvm-svn: 181925
* Fix miscompile due to StackColoring incorrectly merging stack slots (PR15707) | Derek Schuff | 2013-05-15 | 2 | -11/+30
    IR optimisation passes can result in a basic block that contains:

        llvm.lifetime.start(%buf)
        ...
        llvm.lifetime.end(%buf)
        ...
        llvm.lifetime.start(%buf)

    Before this change, calculateLiveIntervals() was ignoring the second lifetime.start() and was regarding %buf as being dead from the lifetime.end() through to the end of the basic block. This can cause StackColoring to incorrectly merge %buf with another stack slot.

    Fix by removing the incorrect Starts[pos].isValid() and Finishes[pos].isValid() checks.

    Just doing:

        Starts[pos] = Indexes->getMBBStartIdx(MBB);
        Finishes[pos] = Indexes->getMBBEndIdx(MBB);

    unconditionally would be enough to fix the bug, but it causes some test failures due to stack slots not being merged when they were before. So, in order to keep the existing tests passing, treat LiveIn and LiveOut separately rather than approximating the live ranges by merging LiveIn and LiveOut.

    This fixes PR15707.

    Patch by Mark Seaborn.

    llvm-svn: 181922
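The start/end/start pattern from r181922 can be modelled with a toy interval calculation. This is a simplified sketch and not the StackColoring implementation: Marker, Event, Segment and liveSegments are hypothetical names, positions are plain instruction indices within a single basic block, and events are assumed to be sorted by position.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical lifetime marker kinds for one stack slot.
enum class Marker { Start, End };

// Hypothetical event: a marker at an instruction index within the block.
struct Event {
  std::size_t Pos;
  Marker Kind;
};

// Half-open range [first, second) of instruction indices.
using Segment = std::pair<std::size_t, std::size_t>;

// Compute the ranges in which the slot is live inside one block. Every
// Start that follows an End opens a new range, so a trailing Start keeps
// the slot live through to the end of the block.
std::vector<Segment> liveSegments(const std::vector<Event> &Events,
                                  bool LiveIn, std::size_t BlockLen) {
  std::vector<Segment> Segments;
  bool Live = LiveIn;
  std::size_t Begin = 0;
  for (const Event &E : Events) {
    if (E.Kind == Marker::Start && !Live) {
      Live = true;
      Begin = E.Pos;
    } else if (E.Kind == Marker::End && Live) {
      Segments.push_back({Begin, E.Pos});
      Live = false;
    }
  }
  if (Live)
    Segments.push_back({Begin, BlockLen});
  return Segments;
}
```

For markers start@1, end@4, start@7 in a 10-instruction block, this yields [1,4) and [7,10). Treating the slot as dead from index 4 to the end of the block would drop the second range, which is the behaviour the commit message identifies as the cause of the incorrect merge.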
* Cleanup relocation sorting for ELF. | Rafael Espindola | 2013-05-15 | 3 | -58/+16
    We want the order to be deterministic on all platforms. NAKAMURA Takumi fixed that in r181864. This patch is just two small cleanups:

    * Move the function to the cpp file. It is only passed to array_pod_sort.
    * Remove the ppc implementation which is now redundant

    llvm-svn: 181910
* PPCISelLowering.h: Escape \@ in comments. [-Wdocumentation] | NAKAMURA Takumi | 2013-05-15 | 1 | -14/+14
    llvm-svn: 181907
* Whitespace. | NAKAMURA Takumi | 2013-05-15 | 1 | -2/+2
    llvm-svn: 181906
* [objc-arc] Fixed a spelling error and made the statistic descriptions be consistent about their usage of periods. | Michael Gottesman | 2013-05-15 | 1 | -5/+5
    llvm-svn: 181901
* Add missing #include | Douglas Gregor | 2013-05-15 | 1 | -0/+1
    llvm-svn: 181900
* Support unaligned load/store on more ARM targets | Derek Schuff | 2013-05-15 | 1 | -4/+17
    This patch matches GCC behavior: the code used to only allow unaligned load/store on ARM for v6+ Darwin, it will now allow unaligned load/store for v6+ Darwin as well as for v7+ on other targets.

    The distinction is made because v6 doesn't guarantee support (but LLVM assumes that Apple controls hardware+kernel and therefore have conformant v6 CPUs), whereas v7 does provide this guarantee (and Linux behaves sanely).

    Overall this should slightly improve performance in most cases because of reduced I$ pressure.

    Patch by JF Bastien

    llvm-svn: 181897
* Remove MCELFObjectTargetWriter::adjustFixupOffset hack | Ulrich Weigand | 2013-05-15 | 3 | -9/+0
    Now that PowerPC no longer uses adjustFixupOffset, and no other back-end (ever?) did, we can remove the infrastructure itself (incidentally addressing a FIXME to that effect).
    llvm-svn: 181895
* [PowerPC] Remove need for adjustFixupOffset hack | Ulrich Weigand | 2013-05-15 | 4 | -66/+52
    Now that applyFixup understands differently-sized fixups, we can define fixup_ppc_lo16/fixup_ppc_lo16_ds/fixup_ppc_ha16 to properly be 2-byte fixups, applied at an offset of 2 relative to the start of the instruction text.

    This has the benefit that if we actually need to generate a real relocation record, its address will come out correctly automatically, without having to fiddle with the offset in adjustFixupOffset.

    Tested on both 64-bit and 32-bit PowerPC, using external and integrated assembler.

    llvm-svn: 181894
* [SystemZ] Make use of SUBTRACT HALFWORD | Richard Sandiford | 2013-05-15 | 5 | -0/+237
    Thanks to Ulrich Weigand for noticing that this instruction was missing.
    llvm-svn: 181893
* [PowerPC] Add test case for r181891 | Ulrich Weigand | 2013-05-15 | 1 | -0/+38
    llvm-svn: 181892
* [PowerPC] Correctly handle fixups of other than 4 byte size | Ulrich Weigand | 2013-05-15 | 1 | -3/+27
    The PPCAsmBackend::applyFixup routine handles the case where a fixup can be resolved within the same object file. However, this routine is currently hard-coded to assume the size of any fixup is always exactly 4 bytes.

    This is sort-of correct for fixups on instruction text; even though it only works because several of what really would be 2-byte fixups are presented as 4-byte fixups instead (requiring another hack in PPCELFObjectWriter::adjustFixupOffset to clean it up).

    However, this assumption breaks down completely for fixups on data, which legitimately can be of any size (1, 2, 4, or 8).

    This patch makes applyFixup aware of fixups of varying sizes, introducing a new helper routine getFixupKindNumBytes (along the lines of what the ARM back end does). Note that in order to handle fixups of size 8, we also need to fix the return type of adjustFixupValue to uint64_t to avoid truncation.

    Tested on both 64-bit and 32-bit PowerPC, using external and integrated assembler.

    llvm-svn: 181891
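The size-aware fixup handling described in r181891 (and relied on by r181894 for 2-byte fixups at offset 2) can be sketched as a standalone routine. This is an assumption-laden illustration, not the PPCAsmBackend code: the FixupKind enumerators, fixupNumBytes and applyFixupBE are hypothetical names, and the sketch assumes a big-endian byte order with the patched bytes initially zero.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fixup kinds for illustration; not LLVM's enumerators.
enum class FixupKind { Data1, Data2, Data4, Data8, Lo16 };

// Hypothetical counterpart of a "number of bytes for this fixup" helper.
unsigned fixupNumBytes(FixupKind Kind) {
  switch (Kind) {
  case FixupKind::Data1:
    return 1;
  case FixupKind::Data2:
  case FixupKind::Lo16:
    return 2;
  case FixupKind::Data4:
    return 4;
  case FixupKind::Data8:
    return 8;
  }
  return 0; // not reached for valid kinds
}

// OR the low NumBytes bytes of Value into Data at Offset, most significant
// byte first. Value is 64 bits wide so 8-byte fixups are not truncated.
void applyFixupBE(std::vector<uint8_t> &Data, std::size_t Offset,
                  FixupKind Kind, uint64_t Value) {
  const unsigned NumBytes = fixupNumBytes(Kind);
  assert(Offset + NumBytes <= Data.size() && "fixup out of range");
  for (unsigned I = 0; I != NumBytes; ++I)
    Data[Offset + I] |=
        static_cast<uint8_t>(Value >> ((NumBytes - I - 1) * 8));
}
```

Applying a Lo16 fixup with value 0x1234 at offset 2 of a 4-byte instruction leaves the first two bytes untouched and writes 0x12, 0x34 into the last two. Because the fixup's own offset already points at the half-word being patched, no separate offset adjustment is needed.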
* Add Jade to the list of external projects using LLVM in the release notes. | Arnaud A. de Grandmaison | 2013-05-15 | 1 | -0/+14
    Patch by: Antoine Lorence <Antoine.Lorence@insa-rennes.fr>
    llvm-svn: 181886
* [SystemZ] Add more future work items to the README | Richard Sandiford | 2013-05-15 | 1 | -7/+91
    Based on an analysis by Ulrich Weigand.
    llvm-svn: 181882
* [SystemZ] Consolidate disassembler tests for valid input into 2 big tests | Richard Sandiford | 2013-05-15 | 339 | -6714/+6953
    llvm-svn: 181879
* [SystemZ] Consolidate assembler tests into 4 big tests | Richard Sandiford | 2013-05-15 | 609 | -10038/+9176
    llvm-svn: 181878