path: root/llvm/lib/CodeGen
Commit message | Author | Age | Files | Lines
...
* [WinEH] Emit __C_specific_handler tables for the new IRReid Kleckner2015-10-013-61/+220
| | | | | | | | | | | | We emit denormalized tables, where every range of invokes in the same state gets a complete list of EH action entries. This is significantly simpler than trying to infer the correct nested scoping structure from the MI. Fortunately, for SEH, the nesting structure is really just a size optimization. With this, some basic __try / __except examples work. llvm-svn: 249078
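The denormalized layout described above can be sketched on a toy model. Nothing here is LLVM's actual data structure: `SehState`, `InvokeRange`, `TableEntry`, and `emitDenormalizedTable` are hypothetical names, and the sketch assumes each state has one handler label and a parent link. For every invoke range it walks from the range's state out to the outermost enclosing state and emits one entry per state, so no nesting has to be inferred later.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model (hypothetical, not LLVM's types): each EH state names its handler
// and its enclosing (parent) state; -1 means "no enclosing state".
struct SehState {
  int Parent;
  std::string Handler;
};

// A contiguous range of invokes that all share one EH state.
struct InvokeRange {
  std::string Begin;
  std::string End;
  int State;
};

struct TableEntry {
  std::string Begin;
  std::string End;
  std::string Handler;
};

// Emit a complete list of action entries for every range: one entry for the
// range's own state and one for each enclosing state.
std::vector<TableEntry>
emitDenormalizedTable(const std::vector<SehState> &States,
                      const std::vector<InvokeRange> &Ranges) {
  std::vector<TableEntry> Table;
  for (const InvokeRange &R : Ranges) {
    for (int S = R.State; S != -1; S = States[S].Parent) {
      assert(S >= 0 && S < static_cast<int>(States.size()) &&
             "state index out of range");
      Table.push_back({R.Begin, R.End, States[S].Handler});
    }
  }
  return Table;
}
```

A range in an inner state therefore repeats the outer handler, which costs table size but keeps the emitter simple.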
* [WinEH] Stop BranchFolding from merging across funcletsDavid Majnemer2015-10-012-13/+7
| | | | | | | BranchFolding would merge two funclets together, which is not OK. Disable this and strengthen the assertion in FuncletLayout. llvm-svn: 249069
* [WinEH] Make FuncletLayout more robust against catchretDavid Majnemer2015-10-014-21/+131
| | | | | | | | | Catchret transfers control from a catch funclet to an earlier funclet. However, it is not completely clear which funclet the catchret target is part of. Make this clear by stapling the catchret target's funclet membership onto the CATCHRET SDAG node. llvm-svn: 249052
* Reformat.NAKAMURA Takumi2015-10-011-1/+2
| | | | llvm-svn: 249033
* Revert r248959, "[WinEH] Emit int3 after noreturn calls on Win64"NAKAMURA Takumi2015-10-012-3/+8
| | | | | | It broke; LLVM :: CodeGen__Generic__2009-11-16-BadKillsCrash.ll llvm-svn: 249032
* [WinEH] Emit int3 after noreturn calls on Win64Reid Kleckner2015-09-302-8/+3
| | | | | | | | | | | | | | | | | | | | | | | The Win64 unwinder disassembles forwards from each PC to try to determine if this PC is in an epilogue. If so, it skips calling the EH personality function for that frame. Typically, this means you cannot catch an exception in the same frame that you threw it, because 'throw' calls a noreturn runtime function. Previously we avoided this problem with the TrapUnreachable TargetOption, but that's a much bigger hammer than we need. All we need is a 1 byte non-epilogue instruction right after the call. Instead, what we got was an unconditional branch to a shared block containing the ud2, potentially 7 bytes instead of 1. So, this reverts r206684, which added TrapUnreachable, and replaces it with something better. The new code pattern matches for invoke/call followed by unreachable and inserts an int3 into the DAG. To be 100% watertight, we would need to insert SEH_Epilogue instructions into all basic blocks ending in a call with no terminators or successors, but in practice this is unlikely to come up. llvm-svn: 248959
* Fix debug info with SafeStack.Evgeniy Stepanov2015-09-302-10/+13
| | | | llvm-svn: 248933
* HHVM calling conventions.Maksim Panchenko2015-09-292-15/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | The HHVM calling convention, hhvmcc, is used by the HHVM JIT for functions in the translation cache. We currently support the LLVM back end to generate code for X86-64 and may support other architectures in the future. In the HHVM calling convention any GP register could be used to pass and return values, with the exception of R12, which is reserved for the thread-local area and is callee-saved. Other than R12, we always pass RBX and RBP as args, which are our virtual machine's stack pointer and frame pointer respectively. When we enter the translation cache via an hhvmcc function, we expect the stack to be aligned at 16 bytes, i.e. skewed by 8 bytes as opposed to standard ABI alignment. This affects stack object alignment and stack adjustments for function calls. One extra calling convention, hhvm_ccc, is used to call C++ helpers from HHVM's translation cache. It is almost identical to the standard C calling convention, with the exception of the first argument, which is passed in RBP (before we use RDI, RSI, etc.) Differential Revision: http://reviews.llvm.org/D12681 llvm-svn: 248832
* [WinEH] Teach AsmPrinter about funcletsDavid Majnemer2015-09-295-30/+142
| | | | | | | | | | | Summary: Funclets have been turned into functions by the time they hit the object file. Make sure that they have decent names for the symbol table and CFI directives explaining how to reason about their prologues. Differential Revision: http://reviews.llvm.org/D13261 llvm-svn: 248824
* Rename some function arguments in MachineBasicBlock.cpp/h by turning the ↵Cong Hou2015-09-291-55/+55
| | | | | | first letter into upper case. NFC. llvm-svn: 248821
* Arguments spilled on the stack before a function call may haveJeroen Ketema2015-09-291-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | alignment requirements, for example in the case of vectors. These requirements are exploited by the code generator by using move instructions that have similar alignment requirements, e.g., movaps on x86. Although the code generator properly aligns the arguments with respect to the displacement of the stack pointer it computes, the displacement itself may cause misalignment. For example, if we have %3 = load <16 x float>, <16 x float>* %1, align 64 call void @bar(<16 x float> %3, i32 0) the x86 back-end emits: movaps 32(%ecx), %xmm2 movaps (%ecx), %xmm0 movaps 16(%ecx), %xmm1 movaps 48(%ecx), %xmm3 subl $20, %esp <-- if %esp was 16-byte aligned before this instruction, it no longer will be afterwards movaps %xmm3, (%esp) <-- movaps requires 16-byte alignment, while %esp is not aligned as such. movl $0, 16(%esp) calll __bar To solve this, we need to make sure that the computed value with which the stack pointer is changed is a multiple of the maximal alignment seen during its computation. With this change we get proper alignment: subl $32, %esp movaps %xmm3, (%esp) Differential Revision: http://reviews.llvm.org/D12337 llvm-svn: 248786
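The rounding rule in this message can be sketched as follows. This is a minimal illustration, not the code from the patch: `OutgoingArg`, `alignTo`, and `computeStackAdjustment` are hypothetical names, and alignments are assumed to be powers of two. Each argument is placed at an offset aligned for that argument, and the final adjustment is rounded up to a multiple of the largest alignment seen.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one argument spilled to the stack.
struct OutgoingArg {
  std::uint64_t Size;  // bytes occupied on the stack
  std::uint64_t Align; // required alignment, assumed to be a power of two
};

// Round Value up to the next multiple of Align.
inline std::uint64_t alignTo(std::uint64_t Value, std::uint64_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 && "power of two expected");
  return (Value + Align - 1) & ~(Align - 1);
}

// Compute the amount to subtract from the stack pointer. Rounding the total
// to the maximal alignment keeps an aligned stack pointer aligned afterwards.
std::uint64_t computeStackAdjustment(const std::vector<OutgoingArg> &Args) {
  std::uint64_t Offset = 0;
  std::uint64_t MaxAlign = 1;
  for (const OutgoingArg &A : Args) {
    Offset = alignTo(Offset, A.Align) + A.Size;
    MaxAlign = std::max(MaxAlign, A.Align);
  }
  return alignTo(Offset, MaxAlign);
}
```

For the example in the message, a 16-byte vector slot followed by a 4-byte integer gives an unrounded total of 20 bytes; `computeStackAdjustment({{16, 16}, {4, 4}})` rounds this to 32, matching the `subl $32, %esp` shown above.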
* RegisterPressure: LiveRegSet tracks register units not physregsMatthias Braun2015-09-291-1/+1
| | | | | | | There are always more physical registers than register units, so the previous behaviour was correct, but we can do with less memory. llvm-svn: 248767
* [WinEH] Fix ip2state table emission with funcletsReid Kleckner2015-09-285-56/+88
| | | | | | | Previously we were hijacking the old LandingPadInfo data structures to communicate our state numbers. Now we don't need that anymore. llvm-svn: 248763
* Fix unused variable warning in non-debug builds.Richard Trieu2015-09-281-1/+2
| | | | llvm-svn: 248754
* tidy up comments; NFCSanjay Patel2015-09-281-7/+7
| | | | llvm-svn: 248750
* move one-use check under the comment that describes it; NFCISanjay Patel2015-09-281-3/+2
| | | | llvm-svn: 248745
* Improved the interface of methods commuting operands, improved X86-FMA3 ↵Andrew Kaylor2015-09-283-67/+148
| | | | | | | | | | mem-folding&coalescing. Patch by Slava Klochkov (vyacheslav.n.klochkov@intel.com) Differential Revision: http://reviews.llvm.org/D11370 llvm-svn: 248735
* [DAGCombine] Fix getStoreMergeAndAliasCandidates's AA-enabled chain walkingHal Finkel2015-09-281-0/+2
| | | | | | | | | | | | | | | | When AA is being used, non-aliasing stores are canonicalized to use the same chain, and DAGCombiner::getStoreMergeAndAliasCandidates can take advantage of this by looking only at users of a store's chain operand. However, user iteration is not result-number specific, so we need to check that the use is as a chain operand, and not via some other operand. It is certainly possible to have another potentially-aliasing store, which shares the first's base pointer, and uses the first's chain's node via some other operand. Failure to catch this situation caused, at least in the included test case, an assert later because the relative sequence-number ordering caused later replacement to create a cycle in the DAG. llvm-svn: 248698
* Remove 'const' from some ArrayRefs. ArrayRefs are already immutable. NFCCraig Topper2015-09-281-2/+2
| | | | llvm-svn: 248693
* [EH] Create removeUnwindEdge utilityJoseph Tremoulet2015-09-271-0/+14
| | | | | | | | | | | | | | | | | Summary: Factor the code that rewrites invokes to calls and rewrites WinEH terminators to their "unwind to caller" equivalents into a helper in Utils/Local, and use it in the three places I'm aware of that need to do this. Reviewers: andrew.w.kaylor, majnemer, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13152 llvm-svn: 248677
* LivePhysRegs: Fix live-outs of return blocksMatthias Braun2015-09-251-2/+10
| | | | | | | | | | | | | I realized that the live-out set computed for the return block is missing the callee saved registers (the non-pristine ones to be exact). This only affects the liveness computed for instructions inside the function epilogue which currently none of the LivePhysRegs users in llvm cares about, so this is just a drive-by fix without a testcase. Differential Revision: http://reviews.llvm.org/D13180 llvm-svn: 248636
* SelectionDAGDumper: Print simple operands inline.Matthias Braun2015-09-251-22/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Print simple operands inline instead of their pointer/value number. Simple operands are SDNodes without predecessors like Constant(FP), Register, UNDEF. This unifies the behaviour with dumpr() which was already doing this. Previously: t0: ch = EntryToken t1: i64 = Register %vreg0 t2: i64,ch = CopyFromReg t0, t1 t3: i64 = Constant<1> t4: i64 = add t2, t3 t5: i64 = Constant<2> t6: i64 = add t2, t5 t10: i64 = undef t11: i8,ch = load t0, t2, t10<LD1[%tmp81]> t12: i8,ch = load t0, t4, t10<LD1[%tmp10]> t13: i8,ch = load t0, t6, t10<LD1[%tmp12]> Now: t0: ch = EntryToken t2: i64,ch = CopyFromReg t0, Register:i64 %vreg0 t4: i64 = add t2, Constant:i64<1> t6: i64 = add t2, Constant:i64<2> t11: i8,ch = load<LD1[%tmp81]> t0, t2, undef:i64 t12: i8,ch = load<LD1[%tmp10]> t0, t4, undef:i64 t13: i8,ch = load<LD1[%tmp12]> t0, t6, undef:i64 Differential Revision: http://reviews.llvm.org/D12567 llvm-svn: 248628
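The printing rule can be sketched on a toy node type. This is not the SelectionDAGDumper code: `Node`, `operandText`, `dumpNode`, and `dumpExampleAdd` are hypothetical, and whether a node counts as "simple" is assumed to be a precomputed flag rather than derived from its opcode. Simple operands are printed inline as name, type, and detail; all other operands are printed by their `tN` number.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for an SDNode (hypothetical, not LLVM's class).
struct Node {
  int Id;
  std::string Type;   // e.g. "i64"
  std::string Name;   // e.g. "add", "Constant"
  std::string Detail; // e.g. "<1>" for a constant, empty otherwise
  bool Simple;        // assumed precomputed: print inline when used as operand
  std::vector<const Node *> Ops;
};

// Text used when N appears as an operand of another node.
std::string operandText(const Node &N) {
  if (N.Simple)
    return N.Name + ":" + N.Type + N.Detail;
  return "t" + std::to_string(N.Id);
}

// One dump line for N, with simple operands printed inline.
std::string dumpNode(const Node &N) {
  std::string Out =
      "t" + std::to_string(N.Id) + ": " + N.Type + " = " + N.Name + N.Detail;
  for (std::size_t I = 0; I < N.Ops.size(); ++I)
    Out += (I == 0 ? " " : ", ") + operandText(*N.Ops[I]);
  return Out;
}

// Rebuilds the "t4" line from the message above.
std::string dumpExampleAdd() {
  Node CopyFromReg{2, "i64,ch", "CopyFromReg", "", false, {}};
  Node One{3, "i64", "Constant", "<1>", true, {}};
  Node Add{4, "i64", "add", "", false, {&CopyFromReg, &One}};
  return dumpNode(Add);
}
```

`dumpExampleAdd()` returns `t4: i64 = add t2, Constant:i64<1>`, the same text as the corresponding line in the "Now" listing.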
* DAGCombiner: Check if store is volatile firstMatt Arsenault2015-09-251-3/+3
| | | | | | This is the simpler check. NFC. llvm-svn: 248625
* TargetRegisterInfo: Introduce PrintLaneMask.Matthias Braun2015-09-259-26/+25
| | | | | | | This makes it more convenient to print lane masks and leads to more uniform printing. llvm-svn: 248624
* TargetRegisterInfo: Add typedef unsigned LaneBitmask and use it where ↵Matthias Braun2015-09-2512-88/+90
| | | | | | appropriate; NFC llvm-svn: 248623
* merge vector stores into wider vector stores and fix AArch64 misaligned ↵Sanjay Patel2015-09-251-11/+24
| | | | | | | | | | | | | | | | | | | | | | access TLI hook (PR21711) This is a redo of D7208 ( r227242 - http://llvm.org/viewvc/llvm-project?view=revision&revision=227242 ). The patch was reverted because an AArch64 target could infinite loop after the change in DAGCombiner to merge vector stores. That happened because AArch64's allowsMisalignedMemoryAccesses() wasn't telling the truth. It reported all unaligned memory accesses as fast, but then split some 128-bit unaligned accesses up in performSTORECombine() because they are slow. This patch attempts to fix the problem in AArch64's allowsMisalignedMemoryAccesses() while preserving existing (perhaps questionable) lowering behavior. The x86 test shows that store merging is working as intended for a target with fast 32-byte unaligned stores. Differential Revision: http://reviews.llvm.org/D12635 llvm-svn: 248622
* PrologueEpilogInserter: Fix missing live-ins when savepoint equals restorepointMatthias Braun2015-09-251-3/+6
| | | | | | | | | | | | | | The algorithm would not modify the live-in list of blocks below the save block point which is correct unless it happens to be a restore point at the same time. Also fixes the benign issue of live-in registers being added twice in some cases. The testcase is based on a test submitted by Kit Barton. Differential Revision: http://reviews.llvm.org/D13176 llvm-svn: 248620
* MachineBasicBlock: Factor out common code into isReturnBlock()Matthias Braun2015-09-253-11/+4
| | | | llvm-svn: 248617
* PeepholeOptimizer: Remove redundant copiesMatt Arsenault2015-09-251-0/+79
| | | | | | | | | | | | If a virtual register is copied and another copy was already seen, replace with the previous copy. This only handles the simplest cases for now. This pattern shows up from various operand restrictions AMDGPU has which require inserting copies depending on the register class of the operands. llvm-svn: 248611
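The simplest case described above can be sketched like this. It is an illustration under strong assumptions, not the PeepholeOptimizer implementation: `Copy` and `findRedundantCopies` are hypothetical, registers are plain integers, every register is assumed to be defined once, and register classes and subregisters are ignored. The first copy of each source is remembered, and any later copy of the same source is mapped to the earlier destination.

```cpp
#include <map>
#include <utility>
#include <vector>

// Hypothetical model of "Dst = COPY Src" on virtual registers.
struct Copy {
  unsigned Dst;
  unsigned Src;
};

// Returns a map from the destination of each redundant copy to the
// destination of the first copy of the same source.
std::map<unsigned, unsigned>
findRedundantCopies(const std::vector<Copy> &Copies) {
  std::map<unsigned, unsigned> FirstCopyOf; // Src -> Dst of first copy seen
  std::map<unsigned, unsigned> Replacements;
  for (const Copy &C : Copies) {
    auto Inserted = FirstCopyOf.insert(std::make_pair(C.Src, C.Dst));
    if (!Inserted.second) // Src was already copied: reuse the earlier Dst.
      Replacements[C.Dst] = Inserted.first->second;
  }
  return Replacements;
}
```

With copies `10 = COPY 1`, `11 = COPY 2`, `12 = COPY 1`, the only replacement is register 12 by register 10.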
* Simplify code. NFC.Chad Rosier2015-09-251-6/+1
| | | | llvm-svn: 248610
* Fix typoMatt Arsenault2015-09-241-1/+1
| | | | llvm-svn: 248549
* Codegen: Fix llvm.*absdiff semantic.Mohammad Shahid2015-09-241-16/+22
| | | | | | | | Fixes the overflow case of the llvm.*absdiff intrinsic; also updates the tests and LangRef.rst accordingly. Differential Revision: http://reviews.llvm.org/D11678 llvm-svn: 248483
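The message does not restate the corrected semantics, so the following only illustrates the kind of overflow hazard an absolute-difference operation has. It is not the lowering from the patch, and `absDiff` and `checkAbsDiff` are hypothetical names. Computing `abs(a - b)` on signed values overflows when the operands are far apart; comparing first and subtracting in unsigned arithmetic avoids that.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Absolute difference of two signed 32-bit values, returned as unsigned so
// that every possible result is representable. Unsigned subtraction wraps
// modulo 2^32, which yields the exact distance once the larger operand is
// chosen as the minuend.
std::uint32_t absDiff(std::int32_t A, std::int32_t B) {
  const std::uint32_t UA = static_cast<std::uint32_t>(A);
  const std::uint32_t UB = static_cast<std::uint32_t>(B);
  return A > B ? UA - UB : UB - UA;
}

// Small self-check, including the extreme case where a signed subtraction
// would overflow.
void checkAbsDiff() {
  const std::int32_t Min = std::numeric_limits<std::int32_t>::min();
  const std::int32_t Max = std::numeric_limits<std::int32_t>::max();
  assert(absDiff(5, -3) == 8u);
  assert(absDiff(-3, 5) == 8u);
  assert(absDiff(Min, Max) == std::numeric_limits<std::uint32_t>::max());
}
```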
* Introduce target hook for optimizing register copiesMatt Arsenault2015-09-242-34/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow a target to do something other than search for copies that will avoid cross register bank copies. Implement for SI by only rewriting the most basic copies, so it should look through anything like a subregister extract. I'm not entirely satisfied with this because it seems like eliminating a reg_sequence that isn't fully used should work generically for all targets without them having to override something. However, it seems to be tricky to have a simple implementation of this without rewriting to invalid kinds of subregister copies on some targets. I'm not sure if there is currently a generic way to easily check if a subregister index would be valid for the current use. The current set of TargetRegisterInfo::get*Class functions don't quite behave like I would expect (e.g. getSubClassWithSubReg returns the maximal register class rather than the minimal), so I'm not sure how to make the generic test keep searching if SrcRC:SrcSubReg is a valid replacement for DefRC:DefSubReg. Making the default implementation to check for simple copies breaks a variety of ARM and x86 tests by producing illegal subregister uses. The ARM tests are not actually changed since it should still be using the same sharesSameRegisterFile implementation, this just relaxes them to not check for specific registers. llvm-svn: 248478
* Remove dead declarationMatt Arsenault2015-09-241-1/+0
| | | | llvm-svn: 248471
* Use new TokenFactor chain when merging storesMatt Arsenault2015-09-241-5/+29
| | | | | | | | | | | | | | | | | | | | | If the stores are storing values from loads which partially alias the stores, we could end up placing the merged loads and stores on the same chain which has the potential to break. Each store may have a different chain dependency on only some of the original loads. Create a new TokenFactor to capture all of the required dependencies of the stores rather than assuming all stores can use the same chain. The testcase is a situation where this happens, although it does not have an observable change from this. The DAG nodes just happened to not be reordered before despite this missing chain dependency. This is based on an off-list report for an out of tree target which regressed due to r246307 and I haven't managed to find a case where the nodes do end up reordered with an in tree target. llvm-svn: 248468
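The idea of gathering every store's chain dependency into one new TokenFactor can be sketched on a toy model. This is not the DAGCombiner code: chains are represented by integer ids and `collectChains` is a hypothetical helper. It returns the union of all chains, in first-seen order, which is the operand list a single combined token would need.

```cpp
#include <set>
#include <vector>

// Each inner vector lists the chain ids one original store depends on
// (hypothetical representation). The result is the deduplicated union,
// i.e. the operands for one token that captures every dependency.
std::vector<int>
collectChains(const std::vector<std::vector<int>> &PerStoreChains) {
  std::set<int> Seen;
  std::vector<int> Operands;
  for (const std::vector<int> &Chains : PerStoreChains)
    for (int Chain : Chains)
      if (Seen.insert(Chain).second) // keep the first occurrence only
        Operands.push_back(Chain);
  return Operands;
}
```

If one store depends on chains 1 and 2 and another on chains 2 and 3, the merged store must depend on 1, 2, and 3 rather than on either store's chain alone.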
* Android support for SafeStack.Evgeniy Stepanov2015-09-231-1/+1
| | | | | | | | | | | | | | | | | Add two new ways of accessing the unsafe stack pointer: * At a fixed offset from the thread TLS base. This is very similar to StackProtector cookies, but we plan to extend it to other backends (ARM in particular) soon. Bionic-side implementation here: https://android-review.googlesource.com/170988. * Via a function call, as a fallback for platforms that provide neither a fixed TLS slot, nor a reasonable TLS implementation (i.e. not emutls). This is a re-commit of a change in r248357 that was reverted in r248358. llvm-svn: 248405
* Revert "Android support for SafeStack."Evgeniy Stepanov2015-09-231-1/+1
| | | | | | | test/Transforms/SafeStack/abi.ll breaks when target is not supported; needs refactoring. llvm-svn: 248358
* Android support for SafeStack.Evgeniy Stepanov2015-09-231-1/+1
| | | | | | | | | | | | | | Add two new ways of accessing the unsafe stack pointer: * At a fixed offset from the thread TLS base. This is very similar to StackProtector cookies, but we plan to extend it to other backends (ARM in particular) soon. Bionic-side implementation here: https://android-review.googlesource.com/170988. * Via a function call, as a fallback for platforms that provide neither a fixed TLS slot, nor a reasonable TLS implementation (i.e. not emutls). llvm-svn: 248357
* Fixed an issue on updating profile data when lowering switch statement.Cong Hou2015-09-231-4/+4
| | | | | | Fixed the issue that when there is an edge from the jump table to the default statement, we should check it directly instead of checking if the sibling of the jump table header is a successor of the jump table header, which may not be the default statement but a successor of it. llvm-svn: 248354
* Debug Info: Emit the dwo_name only in skeleton CUs, not in DWOs.Adrian Prantl2015-09-221-3/+5
| | | | llvm-svn: 248340
* LiveIntervalAnalysis: Avoid multiple connected liveness componentsMatthias Braun2015-09-221-8/+26
| | | | | | | | | | | | | We may have subregister defs which are unused but not discovered and cleaned up prior to liveness analysis. This creates multiple connected components in the resulting live range which are forbidden in the MachineVerifier because they would unnecessarily constrain the register allocator. Rewrite those dead definitions to define a newly created virtual register. Differential Revision: http://reviews.llvm.org/D13035 llvm-svn: 248335
* LiveInterval: Distribute subregister liveranges to new intervals in ↵Matthias Braun2015-09-221-29/+65
| | | | | | | | | | | | | | | | ConnectedVNInfoEqClasses::Distribute() This improves ConnectedVNInfoEqClasses::Distribute() to distribute the segments and value numbers in the subranges instead of conservatively clearing all subregister info. No separate test here, just clearing the subrange instead of properly distributing them would however break my upcoming fix regarding dead super register definitions. Differential Revision: http://reviews.llvm.org/D13075 llvm-svn: 248334
* [AArch64] Emit clrex in the expanded cmpxchg fail block.Ahmed Bougacha2015-09-221-3/+14
| | | | | | | | | | | | | | | | | In the comparison failure block of a cmpxchg expansion, the initial ldrex/ldxr will not be followed by a matching strex/stxr. On ARM/AArch64, this unnecessarily ties up the execution monitor, which might have a negative performance impact on some uarchs. Instead, release the monitor in the failure block. The clrex instruction was designed for this: use it. Also see ARMARM v8-A B2.10.2: "Exclusive access instructions and Shareable memory locations". Differential Revision: http://reviews.llvm.org/D13033 llvm-svn: 248291
* Make helper function static. NFC.Benjamin Kramer2015-09-221-3/+3
| | | | llvm-svn: 248278
* Untabify.NAKAMURA Takumi2015-09-222-4/+4
| | | | llvm-svn: 248264
* Reformat blank lines.NAKAMURA Takumi2015-09-223-5/+0
| | | | llvm-svn: 248263
* Reformat comment lines.NAKAMURA Takumi2015-09-221-4/+4
| | | | llvm-svn: 248262
* Reformat.NAKAMURA Takumi2015-09-221-3/+2
| | | | llvm-svn: 248261
* LiveIntervalAnalysis: Factor common code into splitSeparateComponents; NFCMatthias Braun2015-09-225-58/+51
| | | | llvm-svn: 248241
* function names should start with a lower case letter; NFCSanjay Patel2015-09-211-91/+91
| | | | llvm-svn: 248224