path: root/llvm/test/CodeGen/PowerPC
Commit log (author, date; files changed, lines -removed/+added):
...
* Make use of previously generated stores in SelectionDAGLegalize::ExpandExtractFromVectorThroughStack (Hal Finkel, 2014-03-30; 1 file, -0/+6)
  When expanding EXTRACT_VECTOR_ELT and EXTRACT_SUBVECTOR using
  SelectionDAGLegalize::ExpandExtractFromVectorThroughStack, we store the
  entire vector and then load the piece we want. This is fine in isolation,
  but generating a new store (and corresponding stack slot) for each
  extraction ends up producing code of poor quality. When we scalarize a
  vector operation (using SelectionDAG::UnrollVectorOp, for example) we
  generate one EXTRACT_VECTOR_ELT for each element in the vector. This used
  to generate one stored copy of the vector for each element in the vector.
  Now we search the uses of the vector for a suitable store before
  generating a new one, which results in much more efficient scalarization
  code, as illustrated below.
  llvm-svn: 205153
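  For illustration, a minimal hypothetical IR input of the kind this change
  affects (not taken from the commit's tests):

    ; udiv has no native vector form here, so UnrollVectorOp emits one
    ; EXTRACT_VECTOR_ELT per element; the extracts can now all reuse a
    ; single stored copy of each input vector instead of one copy each.
    define <2 x i64> @scalarized(<2 x i64> %a, <2 x i64> %b) {
      %r = udiv <2 x i64> %a, %b
      ret <2 x i64> %r
    }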
* [PowerPC] Handle VSX v2i64 SIGN_EXTEND_INREG (Hal Finkel, 2014-03-30; 1 file, -0/+38)
  sitofp from v2i32 to v2f64 ends up generating a SIGN_EXTEND_INREG v2i64
  node (and similarly for v2i16 and v2i8). Even though there are no
  sign-extension (or algebraic shift) instructions for v2i64 types, we can
  handle v2i32 sign extensions by converting to and from v2i64. The small
  trick necessary here is to shift the i32 elements into the right lanes
  before the i32 -> f64 step. Because of the big-endian nature of the
  system, we need the i32 portion in the high word of the i64 elements.
  For v2i16 and v2i8 we can do the same, but we first use the default
  Altivec shift-based expansion from v2i16 or v2i8 to v2i32 (by casting to
  v4i32) and then apply the above procedure.
  llvm-svn: 205146
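  A sketch of the v2i32 -> v2f64 case described above (hypothetical input,
  not the commit's test):

    ; Legalization widens the v2i32 source to v2i64 and emits a
    ; SIGN_EXTEND_INREG v2i64 node before the int-to-fp conversion.
    define <2 x double> @sext_conv(<2 x i32> %v) {
      %r = sitofp <2 x i32> %v to <2 x double>
      ret <2 x double> %r
    }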
* [PowerPC] Handle v2i64 comparisons (Hal Finkel, 2014-03-29; 1 file, -0/+33)
  v2i64 is a legal type under VSX, but we don't have native vector
  comparisons for it. We can handle eq/ne by casting to an Altivec type;
  everything else must be expanded.
  llvm-svn: 205106
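  For illustration, the eq case that can be handled via an Altivec cast
  (a hypothetical sketch; ordered comparisons like slt must be expanded):

    define <2 x i64> @cmp_eq(<2 x i64> %a, <2 x i64> %b) {
      %c = icmp eq <2 x i64> %a, %b
      %r = sext <2 x i1> %c to <2 x i64>
      ret <2 x i64> %r
    }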
* [PowerPC] Add subregister classes for f64 VSX values (Hal Finkel, 2014-03-29; 2 files, -3/+166)
  We had stored both f64 and v2f64, etc. values in the VSX registers. This
  worked, but was suboptimal because we would always spill 16-byte values
  even though we almost always had scalar 8-byte values. This resulted in
  increased stack-size use, extra memory bandwidth, etc. To fix this, I've
  added 64-bit subregisters of the Altivec registers, and combined those
  with the existing scalar floating-point registers to form a class of VSX
  scalar floating-point registers. The ABI code has also been enhanced to
  use this register class, and some other necessary improvements have been
  made.
  llvm-svn: 205075
* [PowerPC] Fix VSX permutation isel (Hal Finkel, 2014-03-28; 1 file, -1/+1)
  Not only did I invert the indices when I wrote the code, but I also did
  the same thing when I wrote the regression test. Oops.
  llvm-svn: 205046
* [PowerPC] v2[fi]64 need to be explicitly passed in VSX registers (Hal Finkel, 2014-03-28; 1 file, -0/+26)
  v2[fi]64 values need to be explicitly passed in VSX registers. This is
  because the code in TRI that finds the minimal register class given a
  register and a value type will assert if given an Altivec register and a
  non-Altivec type.
  llvm-svn: 205041
* [PowerPC] Use a small cleanup pass to remove VSX self copies (Hal Finkel, 2014-03-27; 1 file, -0/+27)
  As explained in r204976, because of how the allocation of VSX registers
  interacts with the call-lowering code, we sometimes end up generating
  self VSX copies. Specifically, things like this:
    %VSL2<def> = COPY %F2, %VSL2<imp-use,kill>
  (where %F2 is really a sub-register of %VSL2, and so this copy is a nop)
  This adds a small cleanup pass to remove these prior to post-RA
  scheduling.
  llvm-svn: 204980
* [PowerPC] Fix v2f64 vector extract and related patterns (Hal Finkel, 2014-03-27; 1 file, -0/+18)
  First, v2f64 vector extract had not been declared legal (and so the
  existing patterns were not being used). Second, the patterns for that,
  and for scalar_to_vector, should really be a regclass copy, not a
  subregister operation, because the VSX registers directly hold both the
  vector and scalar data.
  llvm-svn: 204971
* [PowerPC] Expand v2i64 shifts (Hal Finkel, 2014-03-27; 1 file, -0/+42)
  These operations need to be expanded during legalization so that isel
  does not crash. In theory, we might be able to custom lower some of
  these; that, however, would need to be follow-up work.
  llvm-svn: 204963
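  A minimal hypothetical input that exercises this expansion:

    ; Per-element shift amounts; legalization unrolls this into scalar
    ; shifts rather than letting isel crash on the vector node.
    define <2 x i64> @shifts(<2 x i64> %a, <2 x i64> %b) {
      %r = shl <2 x i64> %a, %b
      ret <2 x i64> %r
    }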
* [PowerPC] Generate VSX permutations for v2[fi]64 vectors (Hal Finkel, 2014-03-26; 1 file, -0/+65)
  llvm-svn: 204873
* [PowerPC] VSX loads and stores support unaligned access (Hal Finkel, 2014-03-26; 1 file, -0/+18)
  I've not yet updated PPCTTI because I'm not sure what the actual
  relative cost is compared to the aligned uses.
  llvm-svn: 204848
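  For illustration (a hypothetical sketch, in the typed-pointer IR syntax
  of that era):

    ; Under-aligned vector load: natural alignment for <2 x double> is
    ; 16, but with VSX this can still be selected as a vector load.
    define <2 x double> @misaligned(<2 x double>* %p) {
      %v = load <2 x double>* %p, align 8
      ret <2 x double> %v
    }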
* [PowerPC] Use v2f64 <-> v2i64 VSX conversion instructions (Hal Finkel, 2014-03-26; 1 file, -0/+72)
  llvm-svn: 204843
* [PowerPC] Use VSX vector loads/stores for v2[fi]64 (Hal Finkel, 2014-03-26; 1 file, -0/+36)
  These instructions have access to the complete VSX register file. In
  addition, they "swap" the order of the elements so that element 0 (the
  scalar part) comes first in memory and element 1 follows at a higher
  address.
  llvm-svn: 204838
* [PowerPC] Add v2i64 as a legal VSX type (Hal Finkel, 2014-03-26; 1 file, -1/+22)
  v2i64 needs to be a legal VSX type because it is the SetCC result type
  from v2f64 comparisons. We need to expand all non-arithmetic v2i64
  operations. This fixes the lowering for v2f64 VSELECT.
  llvm-svn: 204828
* [PowerPC] Lower VSELECT using xxsel when VSX is available (Hal Finkel, 2014-03-26; 1 file, -0/+77)
  With VSX there is a real vector select instruction, and so we should use
  it. Note that VSELECT will still scalarize for v2f64 because the
  corresponding SetCC result type (v2i64) is not currently a legal type.
  llvm-svn: 204801
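  A hypothetical sketch of the kind of select that now maps to xxsel (the
  v2f64 form still scalarizes at this point, per the note above):

    define <4 x float> @vsel(<4 x float> %a, <4 x float> %b) {
      ; The comparison produces the mask; the select becomes a VSELECT
      ; node that xxsel can implement directly.
      %c = fcmp olt <4 x float> %a, %b
      %r = select <4 x i1> %c, <4 x float> %a, <4 x float> %b
      ret <4 x float> %r
    }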
* [PowerPC] Generate logical vector VSX instructions (Hal Finkel, 2014-03-26; 1 file, -0/+156)
  These instructions are essentially the same as their Altivec
  counterparts, but have access to the larger VSX register file.
  llvm-svn: 204782
* [PowerPC] Select between VSX A-type and M-type FMA instructions just before RA (Hal Finkel, 2014-03-25; 1 file, -0/+124)
  The VSX instruction set has two types of FMA instructions: A-type (where
  the addend is taken from the output register) and M-type (where one of
  the product operands is taken from the output register). This adds a
  small pass that runs just after MI scheduling (and, thus, just before
  register allocation) that mutates A-type instructions (created during
  isel) into M-type instructions when:
    1. This will eliminate an otherwise-necessary copy of the addend.
    2. One of the product operands is killed by the instruction.
  The "right" moment to make this decision is in between scheduling and
  register allocation, because only there do we know whether or not one of
  the product operands is killed by any particular instruction.
  Unfortunately, this also makes the implementation somewhat complicated,
  because the MIs are not in SSA form and we need to preserve the
  LiveIntervals analysis. As a simple example, if we have:
    %vreg5<def> = COPY %vreg9; VSLRC:%vreg5,%vreg9
    %vreg5<def,tied1> = XSMADDADP %vreg5<tied0>, %vreg17, %vreg16,
        %RM<imp-use>; VSLRC:%vreg5,%vreg17,%vreg16
    ...
    %vreg9<def,tied1> = XSMADDADP %vreg9<tied0>, %vreg17, %vreg19,
        %RM<imp-use>; VSLRC:%vreg9,%vreg17,%vreg19
    ...
  We can eliminate the copy by changing from the A-type to the M-type
  instruction. This means:
    %vreg5<def,tied1> = XSMADDADP %vreg5<tied0>, %vreg17, %vreg16,
        %RM<imp-use>; VSLRC:%vreg5,%vreg17,%vreg16
  is replaced by:
    %vreg16<def,tied1> = XSMADDMDP %vreg16<tied0>, %vreg18, %vreg9,
        %RM<imp-use>; VSLRC:%vreg16,%vreg18,%vreg9
  and we remove:
    %vreg5<def> = COPY %vreg9; VSLRC:%vreg5,%vreg9
  llvm-svn: 204768
* [PowerPC] Make use of VSX f64 <-> i64 conversion instructions (Hal Finkel, 2014-03-23; 3 files, -0/+99)
  When VSX is available, these instructions should be used in preference
  to the older variants that only have access to the scalar floating-point
  registers.
  llvm-svn: 204559
* [PowerPC] Fix the VSX v2f64 return register (Hal Finkel, 2014-03-22; 1 file, -3/+1)
  v2f64 values, like other 128-bit values, are returned under VSX in
  register vs34 (Altivec register v2).
  llvm-svn: 204543
* Remove redundant test (Rafael Espindola, 2014-03-21; 1 file, -9/+0)
  This is tested from MC already.
  llvm-svn: 204491
* Fix PR19144: Incorrect offset generated for int-to-fp conversion at -O0 (Bill Schmidt, 2014-03-18; 1 file, -0/+153)
  When converting a signed 32-bit integer to double-precision floating
  point on hardware without a lfiwax instruction, we have to instead use a
  lfd followed by fcfid. We were erroneously offsetting the address by 4
  bytes in preparation for either a lfiwax or lfiwzx when generating the
  lfd. This fixes that silly error.
  This was not caught in the test suite since the conversion tests were
  run with -mcpu=pwr7, which implies availability of lfiwax. I've added
  another test case for older hardware that checks the code we expect in
  the absence of lfiwax and other flavors of fcfid. There are fewer tests
  in this test case because we punt to DAG selection in more cases on
  older hardware. (We must generate complex fiddly sequences in those
  cases, and there is marginal benefit in duplicating that logic in
  fast-isel.)
  llvm-svn: 204155
* [ppc64] Avoid copy relocs in named rodata sections (Ulrich Weigand, 2014-03-14; 1 file, -0/+11)
  Commit r181723 introduced code to avoid placing initialized variables
  needing relocations into the .rodata section, since the resulting copy
  relocs do not work as expected for ppc64 function references. The same
  treatment is also needed for *named* .rodata.XXX sections. This patch
  changes PPC64LinuxTargetObjectFile::SelectSectionForGlobal to modify
  "Kind" *before* calling the default SelectSectionForGlobal routine,
  instead of first calling the default routine and then just checking for
  the (main) .rodata section afterwards.
  llvm-svn: 203921
* Remove the linker_private and linker_private_weak linkages (Rafael Espindola, 2014-03-13; 1 file, -8/+0)
  These linkages were introduced some time ago, but it was never very
  clear what exactly their semantics were or what they should be used for.
  Some investigation found these uses:
    * utf-16 strings in clang.
    * non-unnamed_addr strings produced by the sanitizers.
  It turns out they were just working around a more fundamental problem.
  For some sections a MachO linker needs a symbol in order to split the
  section into atoms, and llvm had no idea that was the case. I fixed that
  in r201700 and it is now safe to use the private linkage. When the
  object ends up in a section that requires symbols, llvm will use a 'l'
  prefix instead of a 'L' prefix and things just work.
  With that, these linkages were already dead, but there was a potential
  future user in the objc metadata information. I am still looking at
  CGObjcMac.cpp, but at this point I am convinced that linker_private and
  linker_private_weak are not what they need. The objc uses are currently
  split in:
    * Regular symbols (no '\01' prefix). LLVM already directly provides
      whatever semantics they need.
    * Uses of a private name (starting with "\01L" or "\01l") and private
      linkage. We can drop the "\01L" and "\01l" prefixes as soon as llvm
      agrees with clang on L being ok or not for a given section. I have
      two patches in code review for this.
    * Uses of a private name and weak linkage.
  The last case is the one that one could think would fit one of these
  linkages. That is not the case. The semantics are:
    * the linker will merge these symbols by *name*.
    * the linker will hide them in the final DSO.
  Given that the merging is done by name, any of the private (or internal)
  linkages would be a bad match. They allow llvm to rename the symbols,
  and that is really not what we want. From the llvm point of view, these
  objects should really be (linkonce|weak)(_odr)?. For now, just keeping
  the "\01l" prefix is probably the best for these symbols. If we one day
  want more direct support in llvm, IMHO what we should add is not a
  linkage but a hidden_symbol attribute. It would be applicable to
  multiple linkages. For example, on weak it would produce the current
  behavior we have for objc metadata. On internal, it would be equivalent
  to private (and we should then remove private).
  llvm-svn: 203866
* [PowerPC] Initial support for the VSX instruction set (Hal Finkel, 2014-03-13; 1 file, -0/+46)
  VSX is an ISA extension supported on the POWER7 and later cores that
  enhances floating-point vector and scalar capabilities. Among other
  things, this adds <2 x double> support and generally helps to reduce
  register pressure.
  The interesting part of this ISA feature is the register configuration:
  there are 64 new 128-bit vector registers, 32 of which are
  super-registers of the existing 32 scalar floating-point registers, and
  the second 32 of which overlap with the 32 Altivec vector registers.
  This makes things like vector insertion and extraction tricky: they can
  be free, but only if we force a restriction to the right register
  subclass when needed. A new "minipass" PPCVSXCopy takes care of this
  (although it could do a more-optimal job of it; see the comment about
  unnecessary copies below).
  Please note that, currently, VSX is not enabled by default when
  targeting anything because it is not yet ready for that. The assembler
  and disassembler are fully implemented and tested. However:
    - CodeGen support causes miscompiles; test-suite runtime failures:
        MultiSource/Benchmarks/FreeBench/distray/distray
        MultiSource/Benchmarks/McCat/08-main/main
        MultiSource/Benchmarks/Olden/voronoi/voronoi
        MultiSource/Benchmarks/mafft/pairlocalalign
        MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4
        SingleSource/Benchmarks/CoyoteBench/almabench
        SingleSource/Benchmarks/Misc/matmul_f64_4x4
    - The lowering currently falls back to using Altivec instructions far
      more than it should. Worse, there are some things that are
      scalarized through the stack that shouldn't be.
    - A lot of unnecessary copies make it past the optimizers, and this
      needs to be fixed.
    - Many more regression tests are needed.
  Normally, I'd fix these things prior to committing, but there are some
  students and other contributors who would like to work on this, and so
  it makes sense to move this development process upstream where it can be
  subject to the regular code-review procedures.
  llvm-svn: 203768
* IR: add a second ordering operand to cmpxchg for failure (Tim Northover, 2014-03-11; 4 files, -34/+34)
  The syntax for "cmpxchg" should now look something like:
    cmpxchg i32* %addr, i32 42, i32 3 acquire monotonic
  where the second ordering argument gives the required semantics in the
  case that no exchange takes place. It should be no stronger than the
  first ordering constraint and cannot be either "release" or "acq_rel"
  (since no store will have taken place).
  rdar://problem/15996804
  llvm-svn: 203559
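  A sketch in context (era-accurate: at this point cmpxchg still returned
  the loaded value directly; the {value, success} pair return came later):

    define i32 @cas(i32* %addr) {
      ; Acquire semantics on success, only monotonic on failure.
      %old = cmpxchg i32* %addr, i32 42, i32 3 acquire monotonic
      ret i32 %old
    }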
* Fixup PPC Darwin i1 argument handling (Hal Finkel, 2014-03-06; 1 file, -0/+5)
  Like on other targets, we need to zero_extend/truncate i1 args before
  copying them to GPRs.
  llvm-svn: 203045
* When using CR bit registers on PPC32, handle the i1 vaarg case (Hal Finkel, 2014-03-06; 1 file, -0/+15)
  When copying an i1 value into a GPR for a vaarg call, we need to
  explicitly zero-extend the i1 value (otherwise an invalid CRBIT -> GPR
  copy will be generated).
  llvm-svn: 203041
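  A hypothetical input exercising this path (old-style call syntax of
  that era):

    declare void @callee(i32, ...)

    ; Passing an i1 through varargs: the value must be zero-extended
    ; into a GPR rather than copied directly from a CR bit.
    define void @caller(i1 %flag) {
      call void (i32, ...)* @callee(i32 1, i1 %flag)
      ret void
    }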
* With PPC CR bit registers, handle int_to_fp on older cores (Hal Finkel, 2014-03-05; 1 file, -0/+21)
  On cores without fpcvt support, we cannot promote int_to_fp i1
  operations, because there is nothing to promote them to. The most
  straightforward implementation of this uses a select to choose between
  the two possible resulting floating-point values (and that's what is
  done here).
  llvm-svn: 203015
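  For illustration, the kind of operation this covers:

    ; With CR bits, i1 is legal, so this int_to_fp cannot be promoted;
    ; it is lowered as a select between the two possible results
    ; (sitofp of an i1 true gives -1.0).
    define double @i1_to_fp(i1 %b) {
      %r = sitofp i1 %b to double
      ret double %r
    }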
* Add a PPC inline asm constraint type for single CR bits (Hal Finkel, 2014-03-02; 1 file, -0/+59)
  Now that the PowerPC backend can track individual CR bits as first-class
  registers, we should also have a way of allocating them for inline asm
  statements. Because these registers are only one bit, if an output
  variable is implicitly cast to a larger integer size, we'll get an
  any_extend to that larger type (this is part of the existing
  target-independent logic). As a result, regardless of the size of the
  output type, only the first bit is meaningful.
  The constraint identifier "wc" has been chosen for this purpose.
  Although gcc does not currently support allocating individual CR bits,
  this identifier choice has been coordinated with the gcc PowerPC team,
  and will be marked as reserved for this purpose in the gcc
  constraints.md file.
  llvm-svn: 202657
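  A hypothetical sketch of how this might look at the IR level (assuming
  the usual '^' prefix that IR constraint strings use for two-letter
  target constraints):

    ; Allocate a single CR bit for the asm output and clear it; only
    ; bit 0 of the i32 result is meaningful (the rest is an any_extend).
    define i32 @clear_crbit() {
      %r = call i32 asm sideeffect "crxor $0, $0, $0", "=^wc"()
      ret i32 %r
    }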
* Remove extra truncs/exts around i32 bit operations on PPC64 (Hal Finkel, 2014-03-01; 1 file, -2/+30)
  This generalizes the code that eliminates extra truncs/exts around i1
  bit operations to also do the same on PPC64 for i32 bit operations. This
  eliminates a fairly prevalent code wart:
    int foo(int a) {
      return a == 5 ? 7 : 8;
    }
  On PPC64, because of the extension implied by the ABI, this would
  generate:
    cmplwi 0, 3, 5
    li 12, 8
    li 4, 7
    isel 3, 4, 12, 2
    rldicl 3, 3, 0, 32
    blr
  where the 'rldicl 3, 3, 0, 32' (the extension) is completely
  unnecessary. At least for the single-BB case (which is all that the DAG
  combine mechanism can handle), this unnecessary extension is no longer
  generated.
  llvm-svn: 202600
* Swap PPC isel operands to allow for 0-folding (Hal Finkel, 2014-02-28; 1 file, -27/+17)
  The PPC isel instruction can fold 0 into the first operand (thus
  eliminating the need to materialize a zero-containing register when the
  'true' result of the isel is 0). When the isel is fed by a bit-register
  operation that we can invert, do so as part of the
  bit-register-operation peephole routine.
  llvm-svn: 202469
* Add CR-bit tracking to the PowerPC backend for i1 values (Hal Finkel, 2014-02-28; 10 files, -9/+212)
  This change enables tracking i1 values in the PowerPC backend using the
  condition register bits. These bits can be treated on PowerPC as
  separate registers; individual bit operations (and, or, xor, etc.) are
  supported. Tracking booleans in CR bits has several advantages:
    - Reduction in register pressure (because we no longer need GPRs to
      store boolean values).
    - Logical operations on booleans can be handled more efficiently; we
      used to have to move all results from comparisons into GPRs, perform
      promoted logical operations in GPRs, and then move the result back
      into condition register bits to be used by conditional branches.
      This can be very inefficient, because the CR <-> GPR moves have high
      latency and low throughput (especially when other associated
      instructions are accounted for).
    - On the POWER7 and similar cores, we can increase total throughput by
      using the CR bits. CR bit operations have a dedicated functional
      unit.
  Most of this is more-or-less mechanical: adjustments were needed in the
  calling-convention code, support was added for spilling/restoring
  individual condition-register bits, and conditional branch instruction
  definitions taking specific CR bits were added (plus patterns and code
  for generating bit-level operations).
  This is enabled by default when running at -O2 and higher. For -O0 and
  -O1, where the ability to debug is more important, this feature is
  disabled by default. Individual CR bits do not have assigned DWARF
  register numbers, and storing values in CR bits makes them invisible to
  the debugger.
  It is critical, however, that we don't move i1 values that have been
  promoted to larger values (such as those passed as function arguments)
  into bit registers only to quickly turn around and move the values back
  into GPRs (such as happens when values are returned by functions). A
  pair of target-specific DAG combines are added to remove the
  trunc/extends in:
    trunc(binary-ops(binary-ops(zext(x), zext(y)), ...))
  and:
    zext(binary-ops(binary-ops(trunc(x), trunc(y)), ...))
  In short, we only want to use CR bits where some of the i1 values come
  from comparisons or are used by conditional branches or selects (see the
  sketch below). To put it another way, if we can do the entire i1
  computation in GPRs, then we probably should (on the POWER7, the
  GPR-operation throughput is higher, and for all cores, the CR <-> GPR
  moves are expensive).
  POWER7 test-suite performance results (from 10 runs in each
  configuration):
    SingleSource/Benchmarks/Misc/mandel-2: 35% speedup
    MultiSource/Benchmarks/Prolangs-C++/city/city: 21% speedup
    MultiSource/Benchmarks/MiBench/automotive-susan: 23% speedup
    SingleSource/Benchmarks/CoyoteBench/huffbench: 13% speedup
    SingleSource/Benchmarks/Misc-C++/Large/sphereflake: 13% speedup
    SingleSource/Benchmarks/Misc-C++/mandel-text: 10% speedup
    SingleSource/Benchmarks/Misc-C++-EH/spirit: 10% slowdown
    MultiSource/Applications/lemon/lemon: 8% slowdown
  llvm-svn: 202451
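  The sketch referenced above (hypothetical; i1 values that come from
  comparisons and feed a select are the profitable case for CR bits):

    define i32 @both_lt(i32 %a, i32 %b, i32 %c, i32 %d) {
      %c1 = icmp slt i32 %a, %b
      %c2 = icmp slt i32 %c, %d
      ; Both i1 values live in CR bits, so this 'and' can be a single
      ; crand with no CR <-> GPR round-trip.
      %both = and i1 %c1, %c2
      %r = select i1 %both, i32 1, i32 0
      ret i32 %r
    }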
* Account for 128-bit integer operations in PPCCTRLoops (Hal Finkel, 2014-02-25; 1 file, -0/+31)
  We need to abort the formation of counter-register-based loops where
  there are 128-bit integer operations that might become function calls.
  llvm-svn: 202192
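  A hypothetical loop of the kind that must be rejected (typed-pointer
  syntax of that era; assumes %n >= 1):

    ; The i128 multiply may become a runtime call (e.g. __multi3),
    ; which would clobber the count register, so PPCCTRLoops must not
    ; convert this loop to use it.
    define i128 @product(i128* %p, i64 %n) {
    entry:
      br label %loop
    loop:
      %i = phi i64 [ 0, %entry ], [ %i.next, %loop ]
      %acc = phi i128 [ 1, %entry ], [ %acc.next, %loop ]
      %addr = getelementptr i128* %p, i64 %i
      %x = load i128* %addr
      %acc.next = mul i128 %acc, %x
      %i.next = add i64 %i, 1
      %done = icmp eq i64 %i.next, %n
      br i1 %done, label %exit, label %loop
    exit:
      ret i128 %acc
    }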
* Add back r201608, r201622, r201624 and r201625 (Rafael Espindola, 2014-02-19; 1 file, -4/+4)
  r201608 made llvm correctly handle private globals with MachO. r201622
  fixed a bug in it, and r201624 and r201625 were changes for using
  private linkage, assuming that llvm would do the right thing. They all
  got reverted because r201608 introduced a crash in LTO. This patch
  includes a fix for that. The issue was that TargetLoweringObjectFile now
  has to be initialized before we can mangle names of private globals.
  This is trivially true during the normal codegen pipeline (the asm
  printer does it), but LTO has to do it manually.
  llvm-svn: 201700
* Revert r201622 and r201608 (Daniel Jasper, 2014-02-19; 1 file, -4/+4)
  This causes the LLVMgold plugin to segfault. More information in the
  replies to r201608.
  llvm-svn: 201669
* Fix PR18743 (Rafael Espindola, 2014-02-18; 1 file, -4/+4)
  The IR
    @foo = private constant i32 42
  is valid, but before this patch we would produce an invalid MachO from
  it. It was invalid because it would use an L label in a section where
  the linker needs the labels in order to atomize it.
  One way of fixing it would be to just reject this IR in the backend, but
  that would not be very front-end friendly. What this patch does instead
  is use an 'l' prefix in sections that we know the linker requires
  symbols for atomizing. This allows frontends to just use private and not
  worry about which sections their globals go to or how the linker handles
  them.
  One small issue with this strategy is that now a symbol name depends on
  the section, which is not available before codegen. This is not a
  problem in practice. The reason is that it only happens with private
  linkage, which will be ignored by the non-codegen users (llvm-nm and
  llvm-ar).
  llvm-svn: 201608
* Actually call FileCheck in tests (Nico Rieck, 2014-02-16; 1 file, -1/+1)
  llvm-svn: 201491
* "foo" is not a ppc instruction, don't try to parse it.Rafael Espindola2014-02-131-1/+1
| | | | llvm-svn: 201336
* Re-commit: Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call (Daniel Sanders, 2014-02-13; 3 files, -3/+30)
  Summary: AsmPrinter::EmitInlineAsm() will no longer use the
  EmitRawText() call for targets with mature MC support. Such targets will
  always parse the inline assembly (even when emitting assembly). Targets
  without mature MC support continue to use EmitRawText() for assembly
  output.
  The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been
  replaced with MCAsmInfo::UseIntegratedAs which, when true, causes the
  integrated assembler to parse inline assembly (even when emitting
  assembly output). UseIntegratedAs is set to true for targets that
  consider any failure to parse valid assembly to be a bug. Target
  specific subclasses generally enable the integrated assembler in their
  constructor. The default value can be overridden with -no-integrated-as.
  All tests that rely on inline assembly supporting invalid assembly (for
  example, those that use mnemonics such as 'foo' or 'hello world') have
  been updated to disable the integrated assembler.
  Changes since review (and last commit attempt):
    - Fixed test failures that were missed due to configuration of the
      local build (fixes crash.ll and a couple of others).
    - Fixed tests that happened to pass because the local build was on X86
      (should fix 2007-12-17-InvokeAsm.ll).
    - mature-mc-support.ll tests should no longer require all targets to
      be compiled (should fix ARM and PPC buildbots).
    - Object output (-filetype=obj and similar) now forces the integrated
      assembler to be enabled regardless of the default setting or
      -no-integrated-as (should fix SystemZ buildbots).
  Reviewers: rafael
  Reviewed By: rafael
  CC: llvm-commits
  Differential Revision: http://llvm-reviews.chandlerc.com/D2686
  llvm-svn: 201333
* Revert r201237+r201238: Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call (Daniel Sanders, 2014-02-12; 2 files, -3/+3)
  It introduced multiple test failures in the buildbots.
  llvm-svn: 201241
* Demote EmitRawText call in AsmPrinter::EmitInlineAsm() and remove hasRawTextSupport() call (Daniel Sanders, 2014-02-12; 2 files, -3/+3)
  Summary: AsmPrinter::EmitInlineAsm() will no longer use the
  EmitRawText() call for targets with mature MC support. Such targets will
  always parse the inline assembly (even when emitting assembly). Targets
  without mature MC support continue to use EmitRawText() for assembly
  output.
  The hasRawTextSupport() check in AsmPrinter::EmitInlineAsm() has been
  replaced with MCAsmInfo::UseIntegratedAs which, when true, causes the
  integrated assembler to parse inline assembly (even when emitting
  assembly output). UseIntegratedAs is set to true for targets that
  consider any failure to parse valid assembly to be a bug. Target
  specific subclasses generally enable the integrated assembler in their
  constructor. The default value can be overridden with -no-integrated-as.
  All tests that rely on inline assembly supporting invalid assembly (for
  example, those that use mnemonics such as 'foo' or 'hello world') have
  been updated to disable the integrated assembler.
  Reviewers: rafael
  Reviewed By: rafael
  CC: llvm-commits
  Differential Revision: http://llvm-reviews.chandlerc.com/D2686
  llvm-svn: 201237
* Fix a bug with .weak_def_can_be_hidden: mutable variables cannot use it (Rafael Espindola, 2014-02-07; 1 file, -6/+18)
  Thanks to John McCall for noticing it.
  llvm-svn: 200977
* Convert test to FileCheck (Rafael Espindola, 2014-02-06; 1 file, -12/+16)
  llvm-svn: 200955
* DebugInfo: Remove some unneeded conditionals now that DIBuilder no longer emits zero-length arrays as {i32 0} (David Blaikie, 2014-02-04; 3 files, -3/+3)
  A bunch of test cases needed to be cleaned up for this, many my fault:
  when implementing imported modules I updated test cases by simply
  duplicating the prior metadata field, which wasn't always the empty
  metadata entry.
  llvm-svn: 200731
* Handle spilling the PPC GPRC_NOR0 register class (Hal Finkel, 2014-01-28; 1 file, -0/+23)
  GPRC_NOR0 is not a subclass of GPRC (because it also contains the ZERO
  pseudo register). As a result, we also need to check for it in the
  spilling code.
  llvm-svn: 200288
* Add a TBAA CodeGen failure test case (Hal Finkel, 2014-01-25; 1 file, -0/+41)
  I disabled the use of TBAA in CodeGen in r200093. This adds a test case
  that demonstrates the problems with inttoptr and TBAA in CodeGen (and,
  specifically, the problem that causes LLVM to miscompile itself in
  Release mode). This test will currently fail if -use-tbaa-in-sched-mi is
  enabled.
  llvm-svn: 200097
* Fix pointer info on PPC byval stores (Hal Finkel, 2014-01-21; 1 file, -0/+17)
  For PPC64 SVR4 (and Darwin), the stores that take byval aggregate
  parameters from registers into the stack frame had MachinePointerInfo
  objects with incorrect offsets. These offsets are relative to the object
  itself, not to the stack frame base. This fixes self-hosting on PPC64
  when compiling with -enable-aa-sched-mi.
  llvm-svn: 199763
* Implement initial-exec TLS for PPC32 (Roman Divacky, 2013-12-20; 1 file, -0/+6)
  llvm-svn: 197824
* On ppc32-darwin, an i64 inside a structure can have 32-bit alignment (Rafael Espindola, 2013-12-18; 1 file, -2/+2)
  Thanks to Iain Sandoe for testing this with the original gcc. Clang was
  already getting this right.
  llvm-svn: 197572
* Add a reduced testcase from the recent bootstrap crash (Rafael Espindola, 2013-12-16; 1 file, -0/+17)
  llvm-svn: 197426