path: root/llvm/lib
Columns: Commit message, Author, Age, Files, Lines
* Make references to HexagonTargetMachine "const".Krzysztof Parzyszek2013-05-066-25/+26
| | | | llvm-svn: 181233
* Rotate multi-exit loops even if the latch was simplified.Andrew Trick2013-05-061-14/+29
| Test case by Michele Scandale!
|
| Fixes PR10293: Load not hoisted out of loop with multiple exits.
|
| There are a few regressions with this patch, now tracked by
| rdar:13817079, and a roughly equal number of improvements. The
| regressions are almost certainly bad luck because LoopRotate has very
| little idea of whether rotation is profitable. Doing better requires a
| more comprehensive solution.
|
| This checkin is a quick fix that lacks generality (PR10293 has a
| counter-example). But it trivially fixes the case in PR10293 without
| interfering with other cases, and it does satisfy the criteria that
| LoopRotate is a loop canonicalization pass that should avoid
| heuristics and special cases.
|
| I can think of two approaches that would probably be better in the
| long run. Ultimately they may both make sense.
|
| (1) LoopRotate should check that the current header would make a good
| loop guard, and that the loop does not already have a sufficient
| guard. The artificial SimplifiedLoopLatch check would be unnecessary,
| and the design would be more general and canonical. Two difficulties:
|
| - We need a strong guarantee that we won't endlessly rotate, so the
|   analysis would need to be precise in order to avoid the
|   SimplifiedLoopLatch precondition.
|
| - Analyses like this are usually based on SCEV, which we don't want to
|   rely on.
|
| (2) Rotate on-demand in late loop passes. This could even be done by
| shoving the loop back on the queue after the optimization that needs
| it. This could work well when we find LICM opportunities in
| multi-branch loops. This requires some work, and it doesn't really
| solve the problem of SCEV wanting a loop guard before the analysis.
|
| llvm-svn: 181230
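For readers unfamiliar with the term, the "loop guard" that rotation creates can be illustrated outside of LLVM. The sketch below is not taken from the patch; the function names and the example loop are made up. It only shows that a top-tested loop and its rotated form (one guard test up front, then a bottom-tested loop) compute the same result, while the rotated body is known to execute at least once once the guard has passed.

```python
def sum_until_unrotated(values, limit):
    """Top-tested loop: the exit test runs before every iteration."""
    total, i = 0, 0
    while i < len(values) and total < limit:
        total += values[i]
        i += 1
    return total


def sum_until_rotated(values, limit):
    """Rotated form: a guard (copy of the header test), then a bottom-tested loop."""
    total, i = 0, 0
    if i < len(values) and total < limit:          # loop guard
        while True:
            total += values[i]                     # body runs at least once here
            i += 1
            if not (i < len(values) and total < limit):  # test now sits in the latch
                break
    return total
```

Because the body is dominated by the guard, work that is invariant across iterations can be placed between the guard and the loop without running when the loop would not have executed at all.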
* R600: Remove dead code from the CodeEmitter v2Tom Stellard2013-05-063-400/+64
| | | | | | | | | v2: - Replace switch statement with TSFlags query Reviewed-by: Vincent Lejeune <vljn@ovi.com> Tested-By: Aaron Watry <awatry@gmail.com> llvm-svn: 181229
* R600: Emit config values in register / value pairsTom Stellard2013-05-062-3/+55
| | | | | | Reviewed-by: Vincent Lejeune <vljn@ovi.com> Tested-By: Aaron Watry <awatry@gmail.com> llvm-svn: 181228
* Remove unnecessary instance variable and rework logic accordingly.Eric Christopher2013-05-062-8/+5
| | | | llvm-svn: 181227
* Grammar.Eric Christopher2013-05-061-1/+2
| | | | llvm-svn: 181226
* R600: Stop emitting the instruction type byte before each instructionTom Stellard2013-05-061-33/+2
| | | | | | Reviewed-by: Vincent Lejeune <vljn@ovi.com> Tested-By: Aaron Watry <awatry@gmail.com> llvm-svn: 181225
* Don't emit .dwo sections unless they exist.Eric Christopher2013-05-061-24/+30
| | | | llvm-svn: 181224
* R600: Emit ISA for CALL_FS_* instructionsTom Stellard2013-05-061-1/+0
| | | | | | Reviewed-by: Vincent Lejeune <vljn@ovi.com> Tested-By: Aaron Watry <awatry@gmail.com> llvm-svn: 181223
* [SystemZ] Update non-pic DWARF encodingsUlrich Weigand2013-05-061-6/+13
| | | | | | | | | | | | As pointed out by Rafael Espindola, we should match the DWARF encodings produced by GCC in both pic and non-pic modes. This was not the case for the non-pic case. This patch changes all DWARF encodings to DW_EH_PE_absptr for the non-pic case, just like GCC does. The test case is updated to check for both variants. llvm-svn: 181222
* PowerPC: Fix unimplemented relocation on ppc64Adhemerval Zanella2013-05-061-0/+5
| | | | | | | This patch handles the R_PPC64_REL64 relocation type for powerpc64 for mcjit. llvm-svn: 181220
* Provide InstCombines for the following 3 cases:Jean-Luc Duprat2013-05-062-0/+53
| A * (1 - (uitofp i1 C)) -> select C, 0, A
| B * (uitofp i1 C) -> select C, B, 0
| select C, 0, A + select C, B, 0 -> select C, B, A
|
| These come up in code that has been hand-optimized from a select to a
| linear blend, on platforms where that may have mattered. We want to
| undo such changes with the following transform:
|
| A*(1 - uitofp i1 C) + B*(uitofp i1 C) -> select C, B, A
|
| llvm-svn: 181216
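A small numeric check can make the blend-to-select equivalence concrete. This is an illustrative sketch only: `blend` and `select` are hypothetical helper names, and the floating-point conditions under which the real InstCombine fires are not stated in the commit message. Following the three cases listed above, a true condition picks `B` and a false one picks `A`.

```python
import math


def blend(a, b, c):
    """Hand-optimized linear blend; float(c) plays the role of `uitofp i1 C`."""
    t = float(c)                      # 0.0 or 1.0
    return a * (1.0 - t) + b * t


def select(c, if_true, if_false):
    """Plain select: no arithmetic on the operand that is not chosen."""
    return if_true if c else if_false


# For finite operands the two forms agree.
for a, b in [(2.5, -4.0), (0.0, 7.0), (1e30, 1e-30)]:
    for c in (False, True):
        assert blend(a, b, c) == select(c, b, a)

# They differ for non-finite operands: inf * 0.0 is NaN, so the blend
# poisons the result even though the select would ignore that operand.
nan_case = blend(float("inf"), 1.0, True)
```

The last line shows why such a rewrite cannot be an unconditional identity over all IEEE values.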
* [SystemZ] Add back endUlrich Weigand2013-05-0662-1/+10763
| | | | | | | | | | | | | | This adds the actual lib/Target/SystemZ target files necessary to implement the SystemZ target. Note that at this point, the target cannot yet be built since the configure bits are missing. Those will be provided shortly by a follow-on patch. This version of the patch incorporates feedback from reviews by Chris Lattner and Anton Korobeynikov. Thanks to all reviewers! Patch by Richard Sandiford. llvm-svn: 181203
* [SystemZ] Define DWARF encodingUlrich Weigand2013-05-061-0/+9
| | | | | | | | | | This is another patch in preparation for adding the SystemZ target. It defines the appropriate values for DWARF encodings; the intent is to be compatible with what GCC currently does on the target. Patch by Richard Sandiford. llvm-svn: 181201
* [PowerPC] Fix memory corruption in AsmParserUlrich Weigand2013-05-061-7/+7
| | | | | | | | As pointed out by Evgeniy Stepanov, assigning a std::string temporary to a StringRef is not a good idea. Rework MatchRegisterName to avoid using the .lower routine. llvm-svn: 181192
* Fix slightly too aggressive concat_vector optimization.Michael Kuperstein2013-05-061-0/+6
| | | | | | (Would sometimes optimize away concats used to extend a vector with undef values) llvm-svn: 181186
* Update the comment to mention that we use TTI.Nadav Rotem2013-05-061-3/+3
| | | | llvm-svn: 181178
* Revert r164763 because it introduces new shuffles.Nadav Rotem2013-05-061-19/+1
| | | | | | Thanks Nick Lewycky for pointing this out. llvm-svn: 181177
* Fix unchecked uses of DominatorTree in MemoryDependenceAnalysis.Matt Arsenault2013-05-061-5/+20
| | | | | | Use unknown results for places where it would be needed llvm-svn: 181176
* Fix const merging when an alias of a const is llvm.used.Rafael Espindola2013-05-062-5/+13
| | | | | | | We used to disable constant merging not only if a constant is llvm.used, but also if an alias of a constant is llvm.used. This change fixes that. llvm-svn: 181175
* Add EH support to the MCJIT.Rafael Espindola2013-05-058-14/+153
| | | | | | | | | This gets exception handling working on ELF and MachO (x86-64 at least). Other than the EH frame registration, this patch also implements support for GOT relocations which are used to locate the personality function on MachO. llvm-svn: 181167
* ARM AnalyzeBranch should conservatively return true when it sees a predicatedEvan Cheng2013-05-051-3/+9
| | | | | | | | indirect branch at the end of the BB. Otherwise the if-converter or the branch folding pass may incorrectly update the BB's successor info if it considers the BB to fall through to the next BB. rdar://13782395 llvm-svn: 181161
* Teach if-converter to avoid removing BBs whose addresses are taken.Evan Cheng2013-05-051-2/+19
| | | | | | rdar://13782395 llvm-svn: 181160
* LoopVectorize: Print values instead of pointers in debug output.Benjamin Kramer2013-05-051-4/+4
| | | | llvm-svn: 181157
* [XCore] Add LDAPB instructions.Richard Osborne2013-05-051-3/+13
| | | | | | | With the change the disassembler now supports the XCore ISA in its entirety. llvm-svn: 181155
* [XCore] Update LDAP to use pcrel_imm.Richard Osborne2013-05-051-3/+3
| | | | llvm-svn: 181154
* [XCore] Rename calltarget -> pcrel_imm.Richard Osborne2013-05-051-6/+6
| | | | | | No functionality change. llvm-svn: 181153
* [XCore] Add BLRB instructions.Richard Osborne2013-05-051-0/+7
| | | | llvm-svn: 181152
* [XCore] Remove '-' from back branch asm syntax.Richard Osborne2013-05-052-6/+18
| | | | | | | | Instead operands are treated as negative immediates where the sign bit is implicit in the instruction encoding. llvm-svn: 181151
* InlineSpiller: Remove quadratic behavior.Benjamin Kramer2013-05-051-8/+11
| | | | | | No functionality change. llvm-svn: 181149
* For ARM backend, fixed "byval" attribute support.Stepan Dyatkovskiy2013-05-053-34/+105
| Now even small structures (small enough to be stored in GPRs) can be
| passed byval.
|
| The regression tests check the following function prototypes:
|
| PR15293:
|   %artz = type { i32 }
|   define void @foo(%artz* byval %s)
|   define void @foo2(%artz* byval %s, i32 %p, %artz* byval %s2)
|
| foo: "s" stored in R0
| foo2: "s" stored in R0, "s2" stored in R2.
|
| The following AAPCS rules are checked (5.5 Parameter Passing, C.4 and
| C.5; "ParamSize" is the parameter size in 32-bit words):
|
| -- NSAA != 0, NCRN < R4 and NCRN+ParamSize > R4.
|    Parameter should be sent to the stack; NCRN := R4.
| -- NSAA != 0, and NCRN < R4, NCRN+ParamSize < R4.
|    Parameter stored in GPRs; NCRN += ParamSize.
|
| llvm-svn: 181148
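The two rules quoted above can be modelled as a tiny allocator. This is a simplified sketch, not the ARM backend's code: `assign_byval` is a hypothetical name, register numbers stand in for NCRN, the split-between-registers-and-stack case is ignored, and the boundary case `NCRN + ParamSize == R4` (not spelled out in the message) is assumed here to fit in registers.

```python
R4 = 4  # index of the first register past the argument registers R0..R3


def assign_byval(ncrn, param_size_words):
    """Return (location, new_ncrn) for one byval parameter.

    ncrn is the Next Core Register Number as an integer (0 means R0).
    """
    if ncrn < R4 and ncrn + param_size_words <= R4:   # assumption: '<=' at the boundary
        return ("R%d" % ncrn, ncrn + param_size_words)
    return ("stack", R4)                              # no more core registers are used


# foo2(%artz* byval %s, i32 %p, %artz* byval %s2): each argument is one word.
s_loc, ncrn = assign_byval(0, 1)      # "s" lands in R0
ncrn += 1                             # the plain i32 %p takes R1
s2_loc, ncrn = assign_byval(ncrn, 1)  # "s2" lands in R2
```

This reproduces the placement stated in the message for `foo2`: "s" in R0 and "s2" in R2.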
* Remove a recently redundant transform from X86ISelLowering.David Majnemer2013-05-051-11/+0
| X86ISelLowering has support to treat:
|   (icmp ne (and (xor %flags, -1), (shl 1, flag)), 0)
|
| as if it were actually:
|   (icmp eq (and %flags, (shl 1, flag)), 0)
|
| However, r179386 has code at the InstCombine level to handle this.
|
| llvm-svn: 181145
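The equivalence of the two patterns is easy to confirm by brute force on a small bit width. The sketch below is only a sanity check written for this log, with made-up function names; it models `xor %flags, -1` as a bitwise NOT on an 8-bit value.

```python
BITS = 8
MASK = (1 << BITS) - 1


def inverted_form(flags, flag):
    """(icmp ne (and (xor flags, -1), (shl 1, flag)), 0)"""
    return ((flags ^ MASK) & (1 << flag) & MASK) != 0


def direct_form(flags, flag):
    """(icmp eq (and flags, (shl 1, flag)), 0)"""
    return (flags & (1 << flag) & MASK) == 0


# Exhaustive check over every 8-bit value and every bit position.
equivalent = all(
    inverted_form(flags, flag) == direct_form(flags, flag)
    for flags in range(1 << BITS)
    for flag in range(BITS)
)
```

Both forms ask whether the selected bit is clear, so the inversion can be folded into the comparison.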
* LoopVectorize: Add support for floating point min/max reductionsArnold Schwaighofer2013-05-051-22/+69
| | | | | | | | | | Add support for min/max reductions when "no-nans-float-math" is enabled. This allows us to assume we have ordered floating point math and treat ordered and unordered predicates equally. radar://13723044 llvm-svn: 181144
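The remark about ordered and unordered predicates can be illustrated with two comparison helpers. This is a sketch with hypothetical function names that mimic the semantics of `fcmp olt` and `fcmp ult`; it is not code from the vectorizer.

```python
import math


def fcmp_olt(a, b):
    """Ordered less-than: false whenever either operand is NaN."""
    return a < b  # Python's '<' already yields False if a NaN is involved


def fcmp_ult(a, b):
    """Unordered less-than: true if either operand is NaN, or if a < b."""
    return math.isnan(a) or math.isnan(b) or a < b


samples = [-1.5, 0.0, 2.0, float("inf")]

# Without NaNs the two predicates always agree, so a min/max pattern
# written with either one describes the same reduction.
agree_without_nans = all(
    fcmp_olt(a, b) == fcmp_ult(a, b) for a in samples for b in samples
)

# With a NaN operand they disagree, which is why the assumption is needed.
nan = float("nan")
disagree_with_nan = fcmp_olt(nan, 1.0) != fcmp_ult(nan, 1.0)
```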
* LoopVectorizer: Cleanup of minimum/maximum pattern match codeArnold Schwaighofer2013-05-051-2/+2
| | | | | | | | | No need for setting the operands. The pointers are going to be bound by the matcher. radar://13723044 llvm-svn: 181142
* LoopVectorize: We don't need an identity element for min/max reductionsArnold Schwaighofer2013-05-051-32/+19
| We can just use the initial element that feeds the reduction.
|
|   max(max(x, y), z) == max(max(x,y), max(x,z))
|
| radar://13723044
|
| llvm-svn: 181141
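The identity above is what lets every vector lane start from the reduction's initial value instead of a neutral element: max is idempotent, so counting the start value once per lane does no harm. The following is a rough scalar model of a lane-wise reduction, with a made-up function name and lane count; it is not the vectorizer's implementation.

```python
def vector_max_reduce(values, start, lanes=4):
    """Model a vectorized max reduction whose lanes all begin at `start`."""
    acc = [start] * lanes                      # broadcast the initial element
    for i in range(0, len(values), lanes):
        chunk = values[i:i + lanes]
        for lane, v in enumerate(chunk):
            acc[lane] = max(acc[lane], v)      # independent per-lane maximum
    return max(acc)                            # final horizontal reduction


result = vector_max_reduce([3, 9, 2, 7, 5, 1], start=4)
```

The same broadcast would be wrong for a sum, where the start value would be added once per lane; that is why additive reductions need an identity element and min/max reductions do not.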
* Add ArrayRef constructor from None, and do the cleanups that this constructor enablesDmitri Gribenko2013-05-0515-24/+23
| | | | Patch by Robert Wilhelm. llvm-svn: 181138
* whitespaceNadav Rotem2013-05-041-2/+2
| | | | llvm-svn: 181137
* Fix an odd comment.Nadav Rotem2013-05-041-2/+1
| | | | llvm-svn: 181136
* AArch64: enable MCJIT and tests now that everything passes.Tim Northover2013-05-041-1/+1
| | | | | | | This removes dire warnings about AArch64 being unsupported and enables the tests when appropriate on this platform. llvm-svn: 181135
* AArch64: implement 64-bit absolute relocation in MCJITTim Northover2013-05-041-0/+5
| | | | | | | | | This is about the simplest relocation, but surprisingly rare in actual code. It occurs in (for example) the MCJIT test test-ptr-reloc.ll. llvm-svn: 181134
* AArch64: add stubs to support long function calls on MCJITTim Northover2013-05-043-2/+84
| As with global accesses, external functions could exist anywhere in
| memory. Therefore the stub must create a complete 64-bit address. This
| patch implements the fragment as (roughly):
|
|   movz x16, #:abs_g3:somefunc
|   movk x16, #:abs_g2_nc:somefunc
|   movk x16, #:abs_g1_nc:somefunc
|   movk x16, #:abs_g0_nc:somefunc
|   br x16
|
| In principle we could save 4 bytes by using a literal-load instead,
| but it is unclear that would be more efficient and can only be tested
| when real hardware is readily available.
|
| This allows (for example) the MCJIT test 2003-05-07-ArgumentTest to
| pass on AArch64.
|
| llvm-svn: 181133
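At the encoding level, a stub of this shape is five 32-bit instruction words. The sketch below assembles them directly from a known target address using the standard A64 encodings for the 64-bit `MOVZ`, `MOVK`, and `BR` instructions; the function name is hypothetical, and the real MCJIT code is not reproduced here (it presumably emits the instructions and lets relocations fill in the immediates).

```python
MOVZ_X = 0xD2800000  # MOVZ Xd, #imm16, LSL #(16 * hw)
MOVK_X = 0xF2800000  # MOVK Xd, #imm16, LSL #(16 * hw)
BR_X = 0xD61F0000    # BR Xn


def build_long_call_stub(target, reg=16):
    """Return five instruction words that load `target` into Xreg and branch."""
    words = []
    for hw in (3, 2, 1, 0):                        # most significant chunk first
        imm16 = (target >> (16 * hw)) & 0xFFFF
        opcode = MOVZ_X if hw == 3 else MOVK_X     # first write zeroes the rest
        words.append(opcode | (hw << 21) | (imm16 << 5) | reg)
    words.append(BR_X | (reg << 5))
    return words


stub = build_long_call_stub(0x1122334455667788)
```

Five words make a 20-byte stub, which is consistent with the remark that a literal load (two instructions plus an 8-byte literal) would save 4 bytes.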
* AArch64: implement relocations for global accessTim Northover2013-05-041-0/+31
| The large memory model (the default, and the main viable one for JIT)
| emits addresses in need of relocation as
|
|   movz x0, #:abs_g3:somewhere
|   movk x0, #:abs_g2_nc:somewhere
|   movk x0, #:abs_g1_nc:somewhere
|   movk x0, #:abs_g0_nc:somewhere
|
| To support this we must implement those four relocations in the
| dynamic loader.
|
| This allows (for example) the test-global.ll MCJIT test to pass on
| AArch64.
|
| llvm-svn: 181132
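Each of the four relocations writes one 16-bit slice of the symbol's address into the `imm16` field (bits 5 to 20) of a move-wide instruction. A loader-side sketch, using hypothetical helper names rather than the RuntimeDyldELF code, could look like this:

```python
IMM16_SHIFT = 5
IMM16_MASK = 0xFFFF << IMM16_SHIFT


def apply_movw_abs(insn, value, group):
    """Patch chunk `group` (0..3) of `value` into a MOVZ/MOVK instruction word."""
    imm16 = (value >> (16 * group)) & 0xFFFF
    return (insn & ~IMM16_MASK) | (imm16 << IMM16_SHIFT)


def read_back(insns_g3_to_g0):
    """Reassemble the 64-bit value that the patched sequence would produce."""
    value = 0
    for group, insn in zip((3, 2, 1, 0), insns_g3_to_g0):
        value |= ((insn >> IMM16_SHIFT) & 0xFFFF) << (16 * group)
    return value


# Unrelocated templates for x0: movz with hw=3, then movk with hw=2, 1, 0.
templates = [0xD2E00000, 0xF2C00000, 0xF2A00000, 0xF2800000]
address = 0x00007F12ABCD0010
patched = [apply_movw_abs(insn, address, g)
           for insn, g in zip(templates, (3, 2, 1, 0))]
```

The three lower groups use the "no check" (`_nc`) variants because a slice of a full 64-bit address is expected to discard higher bits; only a relocation meant to hold the whole value would need an overflow check.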
* AArch64: implement first relocation required for MCJITTim Northover2013-05-043-0/+45
| | | | | | | | | R_AARCH64_PREL32 is present in even trivial .eh_frame sections and so is required to compile any function without the "nounwind" attribute. This change implements very basic infrastructure in the RuntimeDyldELF file and allows (for example) the test-shift.ll MCJIT test to pass on AArch64. llvm-svn: 181131
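A 32-bit PC-relative data relocation of this kind stores `S + A - P`: the symbol address plus the addend, minus the address of the place being patched. The sketch below is a generic illustration with a hypothetical function name; the strict signed-range check is a simplification chosen for this example and is not claimed to match the ABI's exact overflow rule or the patch's behaviour.

```python
def pcrel32(symbol, addend, place):
    """Return S + A - P as the 32-bit field value to store at `place`."""
    result = symbol + addend - place
    if not -(1 << 31) <= result < (1 << 31):      # simplified overflow check
        raise OverflowError("PC-relative distance does not fit in 32 bits")
    return result & 0xFFFFFFFF                    # two's-complement encoding


forward = pcrel32(symbol=0x2000, addend=4, place=0x1000)
backward = pcrel32(symbol=0x1000, addend=0, place=0x2000)
```

Because the stored value is a distance rather than an absolute address, `.eh_frame` contents stay valid wherever the JIT happens to place the section, as long as the referenced code is within range.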
* Build system changes to enable MCJIT on AArch64Tim Northover2013-05-041-1/+1
| | | | | | | These changes just allow AArch64 to take part in the MCJIT world when built correctly. llvm-svn: 181130
* AArch64: use __clear_cache under GCCish environmentsTim Northover2013-05-041-1/+1
| | | | | | | | | AArch64 is going to need some kind of cache-invalidation in order to successfully JIT since it has a weak memory-model. This is provided by a __clear_cache builtin in libgcc, which acts very much like the 32-bit ARM equivalent (on platforms where it exists). llvm-svn: 181129
* Fix buildbot failure on 64 bit linux due to std::max() having differentRichard Osborne2013-05-041-1/+1
| | | | | | operand types. llvm-svn: 181128
* [XCore] Remove unused operand type.Richard Osborne2013-05-041-1/+0
| | | | llvm-svn: 181127
* [XCore] Make use of the target independent global address offset folding.Richard Osborne2013-05-046-98/+41
| | | | | | This lets us remove some custom code that matched constant offsets from globals at instruction selection time as a special addressing mode. No intended functionality change. llvm-svn: 181126
* [XCore] Simplify code that checks for an aligned base plus a constant.Richard Osborne2013-05-042-81/+56
| | | | | | | The code now makes use of ComputeMaskedBits, SelectionDAG::isBaseWithConstantOffset and TargetLowering::isGAPlusOffset where appropriate, reducing the amount of logic needed in XCoreISelLowering. No intended functionality change. llvm-svn: 181125
* [XCore] Move lowering of thread local storage to a separate pass.Richard Osborne2013-05-046-55/+158
| | | | | | | | | | | Thread local storage is not supported by the XMOS linker so we handle thread local variables by lowering the variable to an array of n elements (where n is the number of hardware threads per core, currently 8 for all XMOS devices) indexed by the current thread ID. Previously this lowering was spread across the XCoreISelLowering and the XCoreAsmPrinter classes. Moving this to a separate pass should be much cleaner. llvm-svn: 181124
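The lowering described above amounts to replacing one thread-local variable with an array that has one slot per hardware thread. A toy model of the idea, with a hypothetical class name and the thread ID passed explicitly instead of being read from the hardware, might be:

```python
NUM_HW_THREADS = 8  # hardware threads per core, per the commit message


class LoweredThreadLocal:
    """One thread-local variable lowered to an array indexed by thread ID."""

    def __init__(self, initial):
        # Every hardware thread gets its own copy of the initial value.
        self.slots = [initial] * NUM_HW_THREADS

    def load(self, thread_id):
        return self.slots[thread_id]

    def store(self, thread_id, value):
        self.slots[thread_id] = value


counter = LoweredThreadLocal(0)
counter.store(3, 42)   # thread 3 updates its private copy
```

Each access becomes an ordinary indexed load or store, so no linker support for thread-local storage is required.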