summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* revert r243687: enable fast-math-flag propagation to DAG nodes Sanjay Patel2015-08-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can't propagate FMF partially without breaking DAG-level CSE. We either need to relax CSE to account for mismatched FMF as a temporary work-around or fully propagate FMF throughout the DAG. Surprisingly, there are no existing regression tests for this, but here's an example: define float @fmf(float %a, float %b) { %mul1 = fmul fast float %a, %b %nega = fsub fast float 0.0, %a %mul2 = fmul fast float %nega, %b %abx2 = fsub fast float %mul1, %mul2 ret float %abx2 } $ llc -o - badflags.ll -march=x86-64 -mattr=fma -enable-unsafe-fp-math -enable-fmf-dag=0 ... vmulss %xmm1, %xmm0, %xmm0 vaddss %xmm0, %xmm0, %xmm0 retq $ llc -o - badflags.ll -march=x86-64 -mattr=fma -enable-unsafe-fp-math -enable-fmf-dag=1 ... vmulss %xmm1, %xmm0, %xmm2 vfmadd213ss %xmm2, %xmm1, %xmm0 <--- failed to recognize that (a * b) was already calculated retq llvm-svn: 244053
* [MachineCombiner] Don't use the opcode-only form of computeInstrLatencyHal Finkel2015-08-051-1/+1
| | | | | | | | | In r242277, I updated the MachineCombiner to work with itineraries, but I missed a call that is scheduling-model-only (the opcode-only form of computeInstrLatency). Using the form that takes an MI* allows this to work with itineraries (and should be NFC for subtargets with scheduling models). llvm-svn: 244020
* wrap OptSize and MinSize attributes for easier and consistent access (NFCI)Sanjay Patel2015-08-049-9/+10
| | | | | | | | | | | | | | | | | Create wrapper methods in the Function class for the OptimizeForSize and MinSize attributes. We want to hide the logic of "or'ing" them together when optimizing just for size (-Os). Currently, we are not consistent about this and rely on a front-end to always set OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here that should be added as follow-on patches with regression tests. This patch is NFC-intended: it just replaces existing direct accesses of the attributes by the equivalent wrapper call. Differential Revision: http://reviews.llvm.org/D11734 llvm-svn: 243994
* [SDAG] Fix a result chain in ExpandUnalignedLoadHal Finkel2015-08-041-1/+1
| | | | | | | | | | | | On the code path in ExpandUnalignedLoad which expands an unaligned vector/fp value in terms of a legal integer load of the same size, the ChainResult needs to be the chain result of the integer load. No in-tree test case is currently available. Patch by Jan Hranac! llvm-svn: 243956
* Introduce enum value for previously defined metadata -- make.implicit Chen Li2015-08-041-1/+2
| | | | | | | | | | | | Summary: This patch adds enum value for an existing metadata type -- make.implicit. Using preassigned enum will be helpful to get compile time type checking and avoid string construction and comparison. The patch also changes uses of make.implicit from string metadata to enum metadata. There is no functionality change. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11698 llvm-svn: 243954
* [CodeGen] Fix FCOPYSIGN legalization to account for mismatched types.Ahmed Bougacha2015-08-042-2/+53
| | | | | | | | | | | | | | We used to legalize it like it's any other binary operations. It's not, because it accepts mismatched operand types. Because of that, we used to hit various asserts and miscompiles. Specialize vector legalizations to, in the worst case, unroll, or, when possible, to just legalize the operand that needs legalization. Scalarization isn't covered, because I can't think of a target where some but not all of the 1-element vector types are to be scalarized. llvm-svn: 243924
* MIR Serialization: Serialize the 'volatile' machine memory operand flag.Alex Lorenz2015-08-044-2/+26
| | | | llvm-svn: 243923
* MIR Serialization: Initial serialization of the machine memory operands.Alex Lorenz2015-08-034-7/+173
| | | | | Reviewers: Duncan P. N. Exon Smith llvm-svn: 243915
* -Wdeprecated-clean: Fix cases of violating the rule of 5 in ways that are ↵David Blaikie2015-08-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | deprecated in C++11 Various value handles needed to be copy constructible and copy assignable (mostly for their use in DenseMap). But to avoid an API that might allow accidental slicing, make these members protected in the base class and make derived classes final (the special members become implicitly public there - but disallowing further derived classes that might be sliced to the intermediate type). Might be worth having a warning a bit like -Wnon-virtual-dtor that catches public move/copy assign/ctors in classes with virtual functions. (suppressable in the same way - by making them protected in the base, and making the derived classes final) Could be fancier and only diagnose them when they're actually called, potentially. Also allow a few default implementations where custom implementations (especially with non-standard return types) were implemented. llvm-svn: 243909
* -Wdeprecated-clean: Fix cases of violating the rule of 5 in ways that are ↵David Blaikie2015-08-031-5/+8
| | | | | | | | | | deprecated in C++11 Some functions return concrete ByteStreamers by value - explicitly support that in the base class. (dtor can be virtual, no one seems to be polymorphically owning/destroying them) llvm-svn: 243897
* Refactor AtomicExpand::expandAtomicRMWToCmpXchg into a standalone function.JF Bastien2015-08-031-66/+82
| | | | | | | | | | | | | | | Summary: This is useful for PNaCl's `RewriteAtomics` pass. NaCl intrinsics don't exist for some of the more exotic RMW instructions, so by refactoring this function into its own, `RewriteAtomics` can share code rewriting those atomics with `AtomicExpand` while additionally saving a few cycles by generating the `cmpxchg` NaCl-specific intrinsic with the callback. Without this patch, `RewriteAtomics` would require two extra passes over functions, by first requiring use of the full `AtomicExpand` pass to just expand the leftover exotic RMWs and then running itself again to expand resulting `cmpxchg`s. NFC Reviewers: jfb Subscribers: jfb, llvm-commits Differential Revision: http://reviews.llvm.org/D11422 llvm-svn: 243880
* [GlobalMerge] Allow targets to enable merging of extern variables, NFC.John Brawn2015-08-031-8/+15
| | | | | | | | | | Adjust the GlobalMergeOnExternal option so that the default behaviour is to do whatever the Target thinks is best. Explicitly enabled or disabling the option will override this default. Differential Revision: http://reviews.llvm.org/D10965 llvm-svn: 243873
* AsmPrinter: Split out non-DIE printing from DIE::print(), NFCDuncan P. N. Exon Smith2015-08-021-28/+24
| | | | | | | | Split out a helper `printValues()` for printing `DIEBlock` and `DIELoc`, instead of relying on `DIE::print()`. The shared code was actually fairly small there. No functionality change intended. llvm-svn: 243856
* AsmPrinter: Take DIEValueList in some DwarfUnit API, NFCDuncan P. N. Exon Smith2015-08-022-13/+17
| | | | | | | Take `DIEValueList` instead of `DIE` so that `DIEBlock` and `DIELoc` can stop inheriting from `DIE` in a future commit. llvm-svn: 243855
* AsmPrinter: Change DIEValueList to a subclass of DIE, NFCDuncan P. N. Exon Smith2015-08-021-6/+6
| | | | | | | | | | | | | | | Rewrite `DIEValueList` as a subclass of `DIE`, renaming its API to match `DIE`'s. This is preparation for changing `DIEBlock` and `DIELoc` to stop inheriting from `DIE` and inherit directly from `DIEValueList`. I thought about leaving this as a has-a relationship (and changing `DIELoc` and `DIEBlock` to also have-a `DIEValueList`), but that seemed to require a fair bit more boilerplate and I think it needed more changes to the `DwarfUnit` API than this will. No functionality change intended here. llvm-svn: 243854
* Remove trailing whitespace. NFCI.Simon Pilgrim2015-08-011-7/+7
| | | | llvm-svn: 243838
* Use SDValue bool check. NFCI.Simon Pilgrim2015-08-011-20/+10
| | | | llvm-svn: 243837
* [DAGCombiner] Convert constant AND masks to shuffle clear masks down to the ↵Simon Pilgrim2015-08-011-21/+64
| | | | | | | | | | | | | | byte level The XformToShuffleWithZero method currently checks AND masks at the per-lane level for all-one and all-zero constants and attempts to convert them to legal shuffle clear masks. This patch generalises XformToShuffleWithZero, splitting and checking the sub-lanes of the constants down to the byte level to see if any legal shuffle clear masks are possible. This allows a lot of masks (often from legalization or truncation) to be folded into existing shuffle patterns and removes a lot of constant mask loading. There are a few examples of poor shuffle lowering that are exposed by this patch that will be cleaned up in future patches (e.g. merging shuffles that are separated by bitcasts, x86 legalized v8i8 zero extension uses PMOVZX+AND+AND instead of AND+PMOVZX, etc.) Differential Revision: http://reviews.llvm.org/D11518 llvm-svn: 243831
* MIR Parser: Report an error when a jump table entry is redefined.Alex Lorenz2015-07-311-2/+5
| | | | llvm-svn: 243798
* MIR Parser: Remove unused variable.Alex Lorenz2015-07-311-1/+0
| | | | | | This variable is unused as of r243572. llvm-svn: 243796
* MIR Serialization: Serialize the floating point immediate machine operands.Alex Lorenz2015-07-314-2/+80
| | | | | Reviewers: Duncan P. N. Exon Smith llvm-svn: 243780
* DI: Remove DW_TAG_arg_variable and DW_TAG_auto_variableDuncan P. N. Exon Smith2015-07-314-6/+4
| | | | | | | | | | | | | | | | | | | | | | | | Remove the fake `DW_TAG_auto_variable` and `DW_TAG_arg_variable` tags, using `DW_TAG_variable` in their place Stop exposing the `tag:` field at all in the assembly format for `DILocalVariable`. Most of the testcase updates were generated by the following sed script: find test/ -name "*.ll" -o -name "*.mir" | xargs grep -l 'DILocalVariable' | xargs sed -i '' \ -e 's/tag: DW_TAG_arg_variable, //' \ -e 's/tag: DW_TAG_auto_variable, //' There were only a handful of tests in `test/Assembly` that I needed to update by hand. (Note: a follow-up could change `DILocalVariable::DILocalVariable()` to set the tag to `DW_TAG_formal_parameter` instead of `DW_TAG_variable` (as appropriate), instead of having that logic magically in the backend in `DbgVariable`. I've added a FIXME to that effect.) llvm-svn: 243774
* New EH representation for MSVC compatibilityDavid Majnemer2015-07-313-0/+36
| | | | | | | | | | This introduces new instructions neccessary to implement MSVC-compatible exception handling support. Most of the middle-end and none of the back-end haven't been audited or updated to take them into account. Differential Revision: http://reviews.llvm.org/D11097 llvm-svn: 243766
* [CodeGenPrepare] Compress a pair. No functional change.Benjamin Kramer2015-07-311-7/+3
| | | | llvm-svn: 243759
* [regalloc] Make RegMask clobbers prevent merging vreg's into PhysRegs when ↵Daniel Sanders2015-07-311-0/+8
| | | | | | | | | | | | | | | | | | | | | hoisting def's upwards. Summary: This prevents vreg260 and D7 from being merged in: %vreg260<def> = LDC1 ... JAL <ga:@sin>, <regmask ... list not containing D7 ...> %D7<def> = COPY %vreg260; ... Doing so is not valid because the JAL clobbers the D7. This fixes the almabench regression in the LLVM 3.7.0 release branch. Reviewers: MatzeB Subscribers: MatzeB, qcolombet, hans, llvm-commits Differential Revision: http://reviews.llvm.org/D11649 llvm-svn: 243745
* MIR Parser: Report an error when a constant pool item is redefined.Alex Lorenz2015-07-301-3/+6
| | | | llvm-svn: 243696
* MIR Parser: Report an error when a virtual register is redefined.Alex Lorenz2015-07-301-3/+5
| | | | llvm-svn: 243695
* fix memcpy/memset/memmove lowering when optimizing for sizeSanjay Patel2015-07-301-3/+15
| | | | | | | | | | | | | | | | | | | | Fixing MinSize attribute handling was discussed in D11363. This is a prerequisite patch to doing that. The handling of OptSize when lowering mem* functions was broken on Darwin because it wants to ignore -Os for these cases, but the existing logic also made it ignore -Oz (MinSize). The Linux change demonstrates a widespread problem. The backend doesn't usually recognize the MinSize attribute by itself; it assumes that if the MinSize attribute exists, then the OptSize attribute must also exist. Fixing this more generally will be a follow-on patch or two. Differential Revision: http://reviews.llvm.org/D11568 llvm-svn: 243693
* enable fast-math-flag propagation to DAG nodesSanjay Patel2015-07-301-1/+1
| | | | | | | | | | | This uncovered latent bugs previously: http://reviews.llvm.org/D10403 ...but it's time to try again because internal tests aren't finding more. If time passes and no other bugs are reported, we can remove this cl::opt. llvm-svn: 243687
* Add a TargetMachine hook that verifies DataLayout compatibilityMehdi Amini2015-07-301-0/+4
| | | | | | | | | | | | | Summary: Also provide the associated assertion when CodeGen starts. Reviewers: echristo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11654 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 243682
* MIR Serialization: Serialize the machine basic block's successor weights.Alex Lorenz2015-07-302-1/+26
| | | | | Reviewers: Duncan P. N. Exon Smith llvm-svn: 243659
* Reapply "Add reverse(ContainerTy) range adapter."Pete Cooper2015-07-291-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit r243567, which ultimately reapplies r243563. The fix here was to use std::enable_if for overload resolution. Thanks to David Blaikie for lots of help on this, and for the extra tests! Original commit message follows: For cases where we needed a foreach loop in reverse over a container, we had to do something like for (const GlobalValue *GV : make_range(TypeInfos.rbegin(), TypeInfos.rend())) { This provides a convenience method which shortens this to for (const GlobalValue *GV : reverse(TypeInfos)) { There are 2 versions of this, with a preference to the rbegin() version. The first uses rbegin() and rend() to construct an iterator_range. The second constructs an iterator_range from the begin() and end() methods wrapped in std::reverse_iterator's. Reviewed by David Blaikie. llvm-svn: 243581
* MIR Serialization: Serialize the frame info's save and restore points.Alex Lorenz2015-07-292-8/+33
| | | | | | | This commit serializes the save and restore machine basic block references from the machine frame information class. llvm-svn: 243575
* MIR Parser: Extract the code that parses MBB references into a new method. NFC.Alex Lorenz2015-07-291-5/+18
| | | | | | | This commit extracts the code that's used by the class 'MIRParserImpl' to parse the machine basic block references into a new method named 'parseMBBReference'. llvm-svn: 243572
* Revert "Add reverse(ContainerTy) range adapter."Pete Cooper2015-07-291-1/+2
| | | | | | | | | This reverts commit r243563. The GCC buildbots were extremely unhappy about this. Reverting while we discuss a better way of doing overload resolution. llvm-svn: 243567
* Add reverse(ContainerTy) range adapter.Pete Cooper2015-07-291-2/+1
| | | | | | | | | | | | | | | | | | | | | | | For cases where we needed a foreach loop in reverse over a container, we had to do something like for (const GlobalValue *GV : make_range(TypeInfos.rbegin(), TypeInfos.rend())) { This provides a convenience method which shortens this to for (const GlobalValue *GV : reverse(TypeInfos)) { There are 2 versions of this, with a preference to the rbegin() version. The first uses rbegin() and rend() to construct an iterator_range. The second constructs an iterator_range from the begin() and end() methods wrapped in std::reverse_iterator's. Reviewed by David Blaikie. llvm-svn: 243563
* Roll forward r242871Jingyue Wu2015-07-291-18/+35
| | | | | | | r242871 missed one place that should be guarded with isPhysicalReg. This patch fixes that. llvm-svn: 243555
* MIR Serialization: Serialize the '.cfi_def_cfa' CFI instruction.Alex Lorenz2015-07-294-0/+18
| | | | llvm-svn: 243554
* MIR Parser: Parse multiple LHS register machine operands.Alex Lorenz2015-07-291-4/+7
| | | | llvm-svn: 243553
* move DAGCombiner's allowableAlignment() helper function into the TLISanjay Patel2015-07-293-64/+72
| | | | | | | | | | | | | | | | | | | | | | | Making allowableAlignment() more accessible was suggested as a predecessor patch for D10662, so I've pulled it into TargetLowering. This let's us remove 4 instances of duplicate logic in LegalizeDAG. There's a subtle functional change in the implementation: the existing allowableAlignment() code was using getPrefTypeAlignment() when checking alignment with the DataLayout and assumed that was fast. In this implementation, we use getABITypeAlignment() and assume that is fast. See the TODO comment or the discussion in the Phab review for future improvements in this implementation (don't use the data layout at all). There are no regression test changes from this difference, and I'm not sure how to expose it via a test. I think we actually do want to provide the 'Fast' param when checking this from DAGCombiner::MergeConsecutiveStores(). Ie, we shouldn't merge stores if the new stores are not going to be fast. But that change will require fixing allowsMisalignedMemoryAccess() overrides as noted in D10662. Differential Revision: http://reviews.llvm.org/D10905 llvm-svn: 243549
* Revert "[PeepholeOptimizer] Look through PHIs to find additional register ↵Bruno Cardoso Lopes2015-07-291-285/+82
| | | | | | | | | | sources" Reported to Broke some internal tests: PR24303 This reverts commit r243486. llvm-svn: 243540
* Reverting r243386 because it has serious post-commit concerns that have not ↵Aaron Ballman2015-07-291-0/+5
| | | | | | been addressed. Also reverts r243389, which relied on this commit. llvm-svn: 243527
* Temporarily revert r242871Jingyue Wu2015-07-291-28/+15
| | | | | | PR24299 llvm-svn: 243522
* [Statepoints] Let patchable statepoints have a symbolic call target.Sanjoy Das2015-07-281-1/+17
| | | | | | | | | | | | | | | | | | | | Summary: As added initially, statepoints required their call targets to be a constant pointer null if ``numPatchBytes`` was non-zero. This turns out to be a problem ergonomically, since there is no way to mark patchable statepoints as calling a (readable) symbolic value. This change remove the restriction of requiring ``null`` call targets for patchable statepoints, and changes PlaceSafepoints to maintain the symbolic call target through its transformation. Reviewers: reames, swaroop.sridhar Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11550 llvm-svn: 243502
* ignore duplicate divisor uses when transforming into reciprocal multiplies ↵Sanjay Patel2015-07-281-4/+4
| | | | | | | | | | | | | | | | | | | (PR24141) PR24141: https://llvm.org/bugs/show_bug.cgi?id=24141 contains a test case where we have duplicate entries in a node's uses() list. After r241826, we use CombineTo() to delete dead nodes when combining the uses into reciprocal multiplies, but this fails if we encounter the just-deleted node again in the list. The solution in this patch is to not add duplicate entries to the list of users that we will subsequently iterate over. For the test case, this avoids triggering the combine divisors logic entirely because there really is only one user of the divisor. Differential Revision: http://reviews.llvm.org/D11345 llvm-svn: 243500
* fix TLI's combineRepeatedFPDivisors interface to return the minimum user ↵Sanjay Patel2015-07-281-4/+10
| | | | | | | | | | | | | | | threshold This fix was suggested as part of D11345 and is part of fixing PR24141. With this change, we can avoid walking the uses of a divisor node if the target doesn't want the combineRepeatedFPDivisors transform in the first place. There is no NFC-intended other than that. Differential Revision: http://reviews.llvm.org/D11531 llvm-svn: 243498
* MIR Serialization: Serialize the target index machine operands.Alex Lorenz2015-07-284-0/+74
| | | | | Reviewers: Duncan P. N. Exon Smith llvm-svn: 243497
* [PeepholeOptimizer] Look through PHIs to find additional register sourcesBruno Cardoso Lopes2015-07-281-82/+285
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reapply 243271 with more fixes; although we are not handling multiple sources with coalescable copies, we were not properly skipping this case. - Teaches the ValueTracker in the PeepholeOptimizer to look through PHI instructions. - Add findNextSourceAndRewritePHI method to lookup into multiple sources returnted by the ValueTracker and rewrite PHIs with new sources. With these changes we can find more register sources and rewrite more copies to allow coaslescing of bitcast instructions. Hence, we eliminate unnecessary VR64 <-> GR64 copies in x86, but it could be extended to other archs by marking "isBitcast" on target specific instructions. The x86 example follows: A: psllq %mm1, %mm0 movd %mm0, %r9 jmp C B: por %mm1, %mm0 movd %mm0, %r9 jmp C C: movd %r9, %mm0 pshufw $238, %mm0, %mm0 Becomes: A: psllq %mm1, %mm0 jmp C B: por %mm1, %mm0 jmp C C: pshufw $238, %mm0, %mm0 Differential Revision: http://reviews.llvm.org/D11197 rdar://problem/20404526 llvm-svn: 243486
* MIR Serialization: Serialize the block address machine operands.Alex Lorenz2015-07-284-4/+116
| | | | llvm-svn: 243453
* MIR Parser: Extract the method 'parseGlobalValue'. NFC.Alex Lorenz2015-07-281-9/+16
| | | | | | | | | This commit extracts the code that parses a global value from the method 'parseGlobalAddressOperand' into a new method 'parseGlobalValue', so that this code can be reused by the method which will parse the block address machine operands. llvm-svn: 243450
OpenPOWER on IntegriCloud