summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* [DIBuilder] Make createReferenceType take size and alignKeno Fischer2015-11-161-1/+3
| | | | | | | | | | | | | | | | | | Summary: Since we're passing references to dbg.value as pointers, we need to have the frontend properly declare their sizes and alignments (as it already does for regular pointers) in preparation for my upcoming patch to have the verifer check that the sizes agree. Also augment the backend logic that skips actually emitting this information into DWARF such that it also handles reference types. Reviewers: aprantl, dexonsmith, dblaikie Subscribers: dblaikie, llvm-commits Differential Revision: http://reviews.llvm.org/D14275 llvm-svn: 253186
* Reduce the size of MCRelaxableFragment.Akira Hatanaka2015-11-142-9/+4
| | | | | | | | | | | | | | | | | | | | | | MCRelaxableFragment previously kept a copy of MCSubtargetInfo and MCInst to enable re-encoding the MCInst later during relaxation. A copy of MCSubtargetInfo (instead of a reference or pointer) was needed because the feature bits could be modified by the parser. This commit replaces the MCSubtargetInfo copy in MCRelaxableFragment with a constant reference to MCSubtargetInfo. The copies of MCSubtargetInfo are kept in MCContext, and the target parsers are now responsible for asking MCContext to provide a copy whenever the feature bits of MCSubtargetInfo have to be toggled. With this patch, I saw a 4% reduction in peak memory usage when I compiled verify-uselistorder.lto.bc using llc. rdar://problem/21736951 Differential Revision: http://reviews.llvm.org/D14346 llvm-svn: 253127
* [ShrinkWrapping] Disable the optimization for functions with sanitize likeQuentin Colombet2015-11-141-1/+8
| | | | | | | | | | attribute. Even if the target supports shrink-wrapping, the prologue and epilogue must not move because a crash can happen anywhere and sanitizers need to be able to unwind from the PC of the crash. llvm-svn: 253116
* MachineScheduler: Print initial pressure in debug dumpMatthias Braun2015-11-131-0/+7
| | | | llvm-svn: 253097
* MachineScheduler: Improve debug output for "only one node in readyset"Matthias Braun2015-11-131-2/+2
| | | | | | | When there is only 1 node left in the ready queue and it is picked call the reason "ONLY1" instead of "NOCAND". llvm-svn: 253096
* [SDAG] Fix expansion of BITREVERSEJames Molloy2015-11-131-3/+5
| | | | | | | | | | Richard Trieu noted that UBSan detected an overflowing shift, and the obvious fix caused a crash. What was happening was that the shiftee (1U) was indeed too small for the possible range of shifts it had to handle, but also we were using "VT.getSizeInBits()" to get the maximum type bitwidth, but we wanted "VT.getScalarSizeInBits()" to get the vector lane size instead of the entire vector size. Use an APInt for the shift and VT.getScalarSizeInBits(). llvm-svn: 253023
* [ImplicitNulls] Add some clarifying comments; NFCSanjoy Das2015-11-131-1/+25
| | | | llvm-svn: 253020
* Revert "Remove unnecessary call to getAllocatableRegClass"Tom Stellard2015-11-121-8/+4
| | | | | | | | | | | | | This reverts commit r252565. This also includes the revert of the commit mentioned below in order to avoid breaking tests in AMDGPU: Revert "AMDGPU: Set isAllocatable = 0 on VS_32/VS_64" This reverts commit r252674. llvm-svn: 252956
* [ImplicitNulls] Fix wrapping by breaking up a condition, NFCSanjoy Das2015-11-121-4/+4
| | | | llvm-svn: 252947
* [ImplicitNull] Extract out a HazardDetector class, NFCSanjoy Das2015-11-121-53/+96
| | | | | | This will make later functional changes easier to follow. llvm-svn: 252946
* [ShrinkWrap] Fix a typo in a comment.Quentin Colombet2015-11-121-1/+1
| | | | llvm-svn: 252918
* [ShrinkWrap] Make sure we do not mess up with EH funclet lowering.Quentin Colombet2015-11-121-1/+11
| | | | | | | | ShrinkWrapping does not understand exception handling constraints for now, so make sure we do not mess with them by aborting on functions that use EH funclets. llvm-svn: 252917
* [WinEH] Fix problem with removing an element from a SetVector while iterating.Andrew Kaylor2015-11-121-11/+5
| | | | | | Patch provided by Yaron Keren. (Thanks!) llvm-svn: 252913
* [SDAG] Introduce a new BITREVERSE node along with a corresponding LLVM intrinsicJames Molloy2015-11-127-2/+64
| | | | | | | | | | Several backends have instructions to reverse the order of bits in an integer. Conceptually matching such patterns is similar to @llvm.bswap, and it was mentioned in http://reviews.llvm.org/D14234 that it would be best if these patterns were matched in InstCombine instead of reimplemented in every different target. This patch introduces an intrinsic @llvm.bitreverse.i* that operates similarly to @llvm.bswap. For plumbing purposes there is also a new ISD node ISD::BITREVERSE, with simple expansion and promotion support. The intention is that InstCombine's BSWAP detection logic will be extended to support BITREVERSE too, and @llvm.bitreverse intrinsics emitted (if the backend supports lowering it efficiently). llvm-svn: 252878
* LegalizeDAG: Fix and improve FCOPYSIGN/FABS legalizationMatthias Braun2015-11-121-73/+145
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Factor out code to query and modify the sign bit of a floatingpoint value as an integer. This also works if none of the targets integer types is big enough to hold all bits of the floatingpoint value. - Legalize FABS(x) as FCOPYSIGN(x, 0.0) if FCOPYSIGN is available, otherwise perform bit manipulation on the sign bit. The previous code used "x >u 0 ? x : -x" which is incorrect for x being -0.0! It also takes 34 instructions on ARM Cortex-M4. With this patch we only require 5: vldr d0, LCPI0_0 vmov r2, r3, d0 lsrs r2, r3, #31 bfi r1, r2, #31, #1 bx lr (This could be further improved if the compiler would recognize that r2, r3 is zero). - Only lower FCOPYSIGN(x, y) = sign(x) ? -FABS(x) : FABS(x) if FABS is available otherwise perform bit manipulation on the sign bit. - Perform the sign(x) test by masking out the sign bit and comparing with 0 rather than shifting the sign bit to the highest position and testing for "<s 0". For x86 copysignl (on 80bit values) this gets us: testl $32768, %eax rather than: shlq $48, %rax sets %al testb %al, %al Differential Revision: http://reviews.llvm.org/D11172 llvm-svn: 252839
* [WinEH] Don't forward branches across empty EH pad BBsReid Kleckner2015-11-112-1/+2
| | | | | | | For really simple SEH catchpads, we tried to forward the invoke unwind edge across the empty block. llvm-svn: 252822
* [DAGCombiner] Improve zextload optimization.Geoff Berry2015-11-111-22/+72
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Don't fold (zext (and (load x), cst)) -> (and (zextload x), (zext cst)) if (and (load x) cst) will match as a zextload already and has additional users. For example, the following IR: %load = load i32, i32* %ptr, align 8 %load16 = and i32 %load, 65535 %load64 = zext i32 %load16 to i64 store i32 %load16, i32* %dst1, align 4 store i64 %load64, i64* %dst2, align 8 used to produce the following aarch64 code: ldr w8, [x0] and w9, w8, #0xffff and x8, x8, #0xffff str w9, [x1] str x8, [x2] but with this change produces the following aarch64 code: ldrh w8, [x0] str w8, [x1] str x8, [x2] Reviewers: resistor, mcrosier Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14340 llvm-svn: 252789
* Add target preference for GatherAllAliases max depthMatt Arsenault2015-11-112-1/+2
| | | | llvm-svn: 252775
* clang-format lib/CodeGen/AsmPrinter/DwarfCompileUnit.cppDehao Chen2015-11-111-7/+8
| | | | llvm-svn: 252769
* Emit discriminator for inlined callsites.Dehao Chen2015-11-111-0/+2
| | | | | | | | | | | | Summary: Inlined callsites need to be emitted in debug info so that sample profile can be annotated to the correct inlined instance. Reviewers: dnovillo, dblaikie Subscribers: dblaikie, llvm-commits Differential Revision: http://reviews.llvm.org/D14511 llvm-svn: 252768
* MachineInstr: addRegisterDefReadUndef() => setRegisterDefReadUndef()Matthias Braun2015-11-112-3/+3
| | | | | | This way we can not only add but also remove read undef flags. llvm-svn: 252678
* TableGen: Emit LaneMask for register classes without subregisters as ~0uMatthias Braun2015-11-101-10/+13
| | | | | | | This makes it slightly easier to handle classes with and without subregister uniformly. llvm-svn: 252671
* less indent; NFCISanjay Patel2015-11-101-46/+47
| | | | llvm-svn: 252643
* LegalizeDAG: Implement promote for scalar_to_vectorMatt Arsenault2015-11-101-0/+28
| | | | | | | | | | | This allows avoiding the default Expand behavior which introduces stack usage. Bitcast the scalar and replace the missing elements with undef. This is covered by existing tests and used by a future commit which makes 64-bit vectors legal types on AMDGPU. llvm-svn: 252632
* LegalizeDAG: Implement promote for insert_vector_eltMatt Arsenault2015-11-101-1/+52
| | | | | | | This is covered by existing tests and used by a future commit which makes 64-bit vectors legal types on AMDGPU. llvm-svn: 252631
* LegalizeDAG: Implement promote for extract_vector_eltMatt Arsenault2015-11-101-4/+58
| | | | | | | | | | This is for AMDGPU to implement v2i64 extract as extract of half of a v4i32. This is covered by existing tests and used by a future commit which makes 64-bit vectors legal types on AMDGPU. llvm-svn: 252630
* add 'MustReduceDepth' as an objective/cost-metric for the MachineCombinerSanjay Patel2015-11-101-29/+53
| | | | | | | | | | | | | | | | | | | | | | This is one of the problems noted in PR25016: https://llvm.org/bugs/show_bug.cgi?id=25016 and: http://lists.llvm.org/pipermail/llvm-dev/2015-October/090998.html The spilling problem is independent and not addressed by this patch. The MachineCombiner was doing reassociations that don't improve or even worsen the critical path. This is caused by inclusion of the "slack" factor when calculating the critical path of the original code sequence. If we don't add that, then we have a more conservative cost comparison of the old code sequence vs. a new sequence. The more liberal calculation must be preserved, however, for the AArch64 MULADD patterns because benchmark regressions were observed without that. The two failing test cases now have identical asm that does what we want: a + b + c + d ---> (a + b) + (c + d) Differential Revision: http://reviews.llvm.org/D13417 llvm-svn: 252616
* Support for emitting inline stack probesAndy Ayers2015-11-101-0/+3
| | | | | | | | | | | | | | | | | | For CoreCLR on Windows, stack probes must be emitted as inline sequences that probe successive stack pages between the current stack limit and the desired new stack pointer location. This implements support for the inline expansion on x64. For in-body alloca probes, expansion is done during instruction lowering. For prolog probes, a stub call is initially emitted during prolog creation, and expanded after epilog generation, to avoid complications that arise when introducing new machine basic blocks during prolog and epilog creation. Added a new test case, modified an existing one to exclude non-x64 coreclr (for now). Add test case Fix tests llvm-svn: 252578
* Remove unnecessary call to getAllocatableRegClassMatt Arsenault2015-11-101-4/+8
| | | | | | | | | | | | | | | | | | | | | | I'm not sure what the point of this was. I'm not sure why you would ever define an instruction that produces an unallocatable register class. No tests fail with this removed, and it seems like it should be a verifier error to define such an instruction. This was problematic for AMDGPU because it would make bad decisions by arbitrarily changing the register class when unsetting isAllocatable for VS_32/VS_64, which is currently set as a workaround to this problem. AMDGPU uses the VS_32/VS_64 register classes to represent operands which can use either VGPRs or SGPRs. When isAllocatable is unset for these, this would need to pick either the SGPR or VGPR class and insert either a copy we don't want, or an illegal copy we would need to deal with later. A semi-arbitrary register class ordering decision is made in tablegen, which resulted in always picking a VGPR class because it happens to have more registers than the SGPR register class. We really just want to use whatever register class the original register had. llvm-svn: 252565
* MachineVerifier: Streamline live interval related error reportingMatthias Braun2015-11-091-90/+93
| | | | | | | | | Simply perform additional report_context() calls after a report() instead of adding more and more overloaded variations of report(). Also improve several instances where information was output in an ad-hoc way probably because no matching report() overload was available. llvm-svn: 252552
* MachineVerifier: Add missing linebreakMatthias Braun2015-11-091-0/+1
| | | | | | | MachineInstr::print() with SkipOppers==true does not produce a linebreak, so we have to do that in MachineVerifier::report(). llvm-svn: 252551
* MachineVerifier: MI::print has no TargetMachine overloadMatthias Braun2015-11-091-1/+1
| | | | | | | The code was passing a target machine pointer which degraded to a true operand to SkipOppers. llvm-svn: 252550
* MachineVerifier: print list of live intervals if availableMatthias Braun2015-11-091-1/+4
| | | | llvm-svn: 252549
* add a SelectionDAG method to check if no common bits are set in two nodes; NFCISanjay Patel2015-11-092-16/+13
| | | | | | | | | | | | | | | This was suggested in: http://reviews.llvm.org/D13956 and is a follow-on to: http://reviews.llvm.org/rL252515 http://reviews.llvm.org/rL252519 This lets us remove logically equivalent/duplicated code from DAGCombiner and X86ISelDAGToDAG. A corresponding function for IR instructions already exists in ValueTracking. llvm-svn: 252539
* [WinEH] Don't emit CATCHRET from visitCatchPadDavid Majnemer2015-11-091-12/+5
| | | | | | | Instead, emit a CATCHPAD node which will get selected to a target specific sequence. llvm-svn: 252528
* [WinEH] Tweak funclet prologue/epilogue insertion to pass verifierReid Kleckner2015-11-091-1/+4
| | | | | | | | | | For some reason we'd never run MachineVerifier on WinEH code, and you explicitly have to ask for it with llc. I added it to a few test cases to get some coverage. Fixes PR25461. llvm-svn: 252512
* [WinEH] Re-committing r252249 (Clone funclets with multiple parents) with ↵Andrew Kaylor2015-11-091-34/+1005
| | | | | | | | additional fixes for determinism problems Differential Revision: http://reviews.llvm.org/D14454 llvm-svn: 252508
* [CodeGen] Always promote f16 if not legalOliver Stannard2015-11-093-13/+30
| | | | | | | | | | | | | | | | | | | We don't currently have any runtime library functions for operations on f16 values (other than conversions to and from f32 and f64), so we should always promote it to f32, even if that is not a legal type. In that case, the f32 values would be softened to f32 library calls. SoftenFloatRes_FP_EXTEND now needs to check the promoted operand's type, as it may ne a no-op or require a different library call. getCopyFromParts and getCopyToParts now need to cope with a floating-point value stored in a larger integer part, as is the case for any target that needs to store an f16 value in a 32-bit integer register. Differential Revision: http://reviews.llvm.org/D12856 llvm-svn: 252459
* Erase unused FunctionDIs variables after r252219.Yaron Keren2015-11-072-3/+0
| | | | llvm-svn: 252401
* [WinEH] Update exception pointer registersJoseph Tremoulet2015-11-073-7/+7
| | | | | | | | | | | | | | | | | | | | Summary: The CLR's personality routine passes these in rdx/edx, not rax/eax. Make getExceptionPointerRegister a virtual method parameterized by personality function to allow making this distinction. Similarly make getExceptionSelectorRegister a virtual method parameterized by personality function, for symmetry. Reviewers: pgavlin, majnemer, rnk Subscribers: jyknight, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D14344 llvm-svn: 252383
* DAGCombiner: Check shouldReduceLoadWidth before combining (and (load), x) -> ↵Tom Stellard2015-11-061-1/+2
| | | | | | | | | | | | extload Reviewers: resistor, arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13805 llvm-svn: 252349
* [ShrinkWrapping] Teach shrink-wrapping how to analyze RegMask.Quentin Colombet2015-11-061-8/+38
| | | | | | | Previously we were conservatively assuming that RegMask operands clobber callee saved registers. llvm-svn: 252341
* MachineScheduler: Add regpressure information to debug dumpMatthias Braun2015-11-063-9/+36
| | | | llvm-svn: 252340
* [WinEH] Mark funclet entries and exits as clobbering all registersReid Kleckner2015-11-062-0/+28
| | | | | | | | | | | | | | | | | Summary: In this implementation, LiveIntervalAnalysis invents a few register masks on basic block boundaries that preserve no registers. The nice thing about this is that it prevents the prologue inserter from thinking it needs to spill all XMM CSRs, because it doesn't see any explicit physreg defs in the MI. Reviewers: MatzeB, qcolombet, JosephTremoulet, majnemer Subscribers: MatzeB, llvm-commits Differential Revision: http://reviews.llvm.org/D14407 llvm-svn: 252318
* Revert r252249 (and r252255, r252258), "[WinEH] Clone funclets with multiple ↵NAKAMURA Takumi2015-11-061-977/+14
| | | | | | | | parents" It behaved flaky due to iterating pointer key values on std::set and std::map. llvm-svn: 252279
* Range-for some LiveIntervals code under reviewReid Kleckner2015-11-061-9/+7
| | | | llvm-svn: 252267
* Fix build warningsAndrew Kaylor2015-11-061-4/+4
| | | | llvm-svn: 252255
* [WinEH] Clone funclets with multiple parentsAndrew Kaylor2015-11-061-14/+977
| | | | | | | | | | Windows EH funclets need to always return to a single parent funclet. However, it is possible for earlier optimizations to combine funclets (probably based on one funclet having an unreachable terminator) in such a way that this condition is violated. These changes add code to the WinEHPrepare pass to detect situations where a funclet has multiple parents and clone such funclets, fixing up the unwind and catch return edges so that each copy of the funclet returns to the correct parent funclet. Differential Revision: http://reviews.llvm.org/D13274?id=39098 llvm-svn: 252249
* DI: Reverse direction of subprogram -> function edge.Peter Collingbourne2015-11-052-7/+4
| | | | | | | | | | | | | | | | | | | | | | | Previously, subprograms contained a metadata reference to the function they described. Because most clients need to get or set a subprogram for a given function rather than the other way around, this created unneeded inefficiency. For example, many passes needed to call the function llvm::makeSubprogramMap() to build a mapping from functions to subprograms, and the IR linker needed to fix up function references in a way that caused quadratic complexity in the IR linking phase of LTO. This change reverses the direction of the edge by storing the subprogram as function-level metadata and removing DISubprogram's function field. Since this is an IR change, a bitcode upgrade has been provided. Fixes PR23367. An upgrade script for textual IR for out-of-tree clients is attached to the PR. Differential Revision: http://reviews.llvm.org/D14265 llvm-svn: 252219
* [WinEH] Fix funclet prologues with stack realignmentReid Kleckner2015-11-053-14/+17
| | | | | | | | | | | | | | We already had a test for this for 32-bit SEH catchpads, but those don't actually create funclets. We had a bug that only appeared in funclet prologues, where we would establish EBP and ESI as our FP and BP, and then downstream prologue code would overwrite them. While I was at it, I fixed Win64+funclets+stackrealign. This issue doesn't come up as often there due to the ABI requring 16 byte stack alignment, but now we can rest easy that AVX and WinEH will work well together =P. llvm-svn: 252210
OpenPOWER on IntegriCloud