path: root/llvm/lib/CodeGen
Commit message (Author, Date; files changed, lines -/+)
...
* GlobalISel: add a G_PHI instruction to give phis a type. (Tim Northover, 2016-09-01; 2 files, -2/+4)
  They're another source of generic vregs, which are going to need a type on the definition when we remove the register width from MachineRegisterInfo.
  llvm-svn: 280412
* Rename some variables to have meaningful names. NFC. (Michael Kuperstein, 2016-09-01; 1 file, -29/+29)
  llvm-svn: 280391
* [DAGCombine] Don't fold a trunc if it feeds an anyext (Michael Kuperstein, 2016-09-01; 1 file, -0/+4)
  Legalization tends to create anyext(trunc) patterns. This should always be combined - into either a single trunc, a single ext, or nothing if the types match exactly. But if we happen to combine the trunc first, we may pull the trunc away from the anyext or make it implicit (e.g. the truncate(extract) -> extract(bitcast) fold). To prevent this, we can avoid doing the fold, similarly to how we already handle fpround(fpextend).
  Differential Revision: https://reviews.llvm.org/D23893
  llvm-svn: 280386
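  A hedged sketch of the kind of guard this entry describes (illustrative only; the helper name is made up and this is not the in-tree DAGCombiner code):

    // Illustrative sketch: before folding a truncate, check whether its
    // result feeds an any_extend; if so, leave the pair alone so that
    // anyext(trunc) can be combined as a unit later.
    #include "llvm/CodeGen/ISDOpcodes.h"
    #include "llvm/CodeGen/SelectionDAGNodes.h"

    static bool feedsAnyExtend(llvm::SDNode *Trunc) {
      for (llvm::SDNode *User : Trunc->uses())
        if (User->getOpcode() == llvm::ISD::ANY_EXTEND)
          return true; // defer the trunc fold
      return false;
    }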
* Add an optional parameter with a list of undefs to extendToIndices (Krzysztof Parzyszek, 2016-09-01; 1 file, -2/+3)
  Reapply r280268, hopefully in a version that MSVC likes.
  llvm-svn: 280358
* Add ISD::EH_DWARF_CFA, simplify @llvm.eh.dwarf.cfa on Mips, fix on PowerPC (Hal Finkel, 2016-09-01; 3 files, -12/+20)
  LLVM has an @llvm.eh.dwarf.cfa intrinsic, used to lower the GCC-compatible __builtin_dwarf_cfa() builtin. As pointed out in PR26761, this is currently broken on PowerPC (and likely on ARM as well). Currently, @llvm.eh.dwarf.cfa is lowered using ADD(FRAMEADDR, FRAME_TO_ARGS_OFFSET), where FRAME_TO_ARGS_OFFSET defaults to the constant zero. On x86, FRAME_TO_ARGS_OFFSET is lowered to 2*SlotSize.
  This setup, however, does not work for PowerPC. Because of the way the stack layout works, the canonical frame address is not exactly (FRAMEADDR + FRAME_TO_ARGS_OFFSET) on PowerPC (there is a lower save-area offset as well), so it is not just a matter of implementing FRAME_TO_ARGS_OFFSET for PowerPC (unless we redefine its semantics -- we can do that, since it is currently used only for @llvm.eh.dwarf.cfa lowering, but it is better to directly lower the CFA construct itself, since it can be easily represented as a fixed-offset FrameIndex). Mips currently does this, but by using a custom lowering for ADD that specifically recognizes the (FRAMEADDR, FRAME_TO_ARGS_OFFSET) pattern.
  This change introduces an ISD::EH_DWARF_CFA node, which by default expands using the existing logic, but can be directly lowered by the target. Mips is updated to use this method (which simplifies its implementation, and I suspect makes it more robust), and PowerPC is updated to do the same. Fixes PR26761.
  Differential Revision: https://reviews.llvm.org/D24038
  llvm-svn: 280350
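  For reference, a hedged sketch of the default expansion described here (CFA = FRAMEADDR + FRAME_TO_ARGS_OFFSET); the helper name and type choices are assumptions for illustration, not the verbatim legalizer code:

    // Sketch of the default expansion: add FRAME_TO_ARGS_OFFSET to the
    // current frame address. Details (SDLoc handling, VT choice) are
    // assumptions for illustration.
    #include "llvm/CodeGen/SelectionDAG.h"
    using namespace llvm;

    static SDValue expandEHDwarfCFA(SelectionDAG &DAG, const SDLoc &DL, EVT VT) {
      SDValue FrameAddr =
          DAG.getNode(ISD::FRAMEADDR, DL, VT, DAG.getConstant(0, DL, MVT::i32));
      SDValue Offset = DAG.getNode(ISD::FRAME_TO_ARGS_OFFSET, DL, VT);
      return DAG.getNode(ISD::ADD, DL, VT, FrameAddr, Offset);
    }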
* Add a counter-function insertion pass (Hal Finkel, 2016-09-01; 4 files, -0/+67)
  As discussed in https://reviews.llvm.org/D22666, our current mechanism to support -pg profiling, where we insert calls to mcount(), or some similar function, is fundamentally broken. We insert these calls in the frontend, which means they get duplicated when inlining, and so the accumulated execution counts for the inlined-into functions are wrong.
  Because we don't want the presence of these functions to affect optimization, they should be inserted in the backend. Here's a pass which would do just that. The knowledge of the name of the counting function lives in the frontend, so we're passing it here as a function attribute. Clang will be updated to use this mechanism.
  Differential Revision: https://reviews.llvm.org/D22825
  llvm-svn: 280347
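  A minimal sketch of the attribute-based mechanism described here; the attribute name "counting-function" is an assumption for illustration, not taken from the commit message:

    // Hypothetical sketch: fetch the counting-function name from a string
    // function attribute so the backend pass knows what to call. The
    // attribute name is an assumption, not necessarily the in-tree one.
    #include "llvm/ADT/StringRef.h"
    #include "llvm/IR/Attributes.h"
    #include "llvm/IR/Function.h"
    using namespace llvm;

    static StringRef getCountingFunctionName(const Function &F) {
      Attribute A = F.getFnAttribute("counting-function");
      return A.isStringAttribute() ? A.getValueAsString() : StringRef();
    }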
* [XRay] Detect and emit sleds for sibling/tail calls (Dean Michael Berris, 2016-09-01; 1 file, -3/+10)
  Summary: This change promotes the 'isTailCall(...)' member function to TargetInstrInfo as a query interface for determining on a per-target basis whether a given MachineInstr is a tail call instruction. We build upon this in the XRay instrumentation pass to emit special sleds for tail call optimisations, where we emit the correct kind of sled.
  The tail call sleds look like a mix between the function entry and function exit sleds. Form-wise, the sled comes before the "jmp" instruction that implements the tail call, similar to how we do it for the function entry sled. Functionally, because we know this is a tail call, it behaves much like an exit sled -- i.e. at runtime we may use the exit trampolines instead of a different kind of trampoline.
  A follow-up change to recognise these sleds will be done in compiler-rt, so that we can start intercepting these initially as exits, but also have the option to have different log entries to more accurately reflect that this is actually a tail call.
  Reviewers: echristo, rSerge, majnemer
  Subscribers: mehdi_amini, dberris, llvm-commits
  Differential Revision: https://reviews.llvm.org/D23986
  llvm-svn: 280334
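  The promoted query boils down to a per-target virtual predicate; a sketch of its shape on a stand-in class (the default behaviour and exact signature are assumptions for illustration, not the real TargetInstrInfo declaration):

    // Sketch only: the shape of the per-target tail-call query described above.
    #include "llvm/CodeGen/MachineInstr.h"

    struct ExampleInstrInfoHook {
      virtual ~ExampleInstrInfoHook() = default;
      // Targets override this to recognise their tail-call opcodes; the
      // conservative default is "not a tail call".
      virtual bool isTailCall(const llvm::MachineInstr &Inst) const {
        return false;
      }
    };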
* Revert "Add an optional parameter with a list of undefs to extendToIndices"Reid Kleckner2016-08-311-3/+2
| | | | | | | | | | | | This reverts commit r280268, it causes all MSVC 2013 to ICE. This appears to have been fixed in a later MSVC 2013 update, because I cannot reproduce it locally. That said, all upstream LLVM bots are broken right now, so I am reverting. Also reverts dependent change r280275, "[Hexagon] Deal with undefs when extending live intervals". llvm-svn: 280301
* GlobalISel: use G_TYPE to annotate physregs with a type. (Tim Northover, 2016-08-31; 6 files, -6/+17)
  More preparation for dropping source types from MachineInstrs: registers coming out of already-selected code (i.e. non-generic instructions) don't have a type, but that information is needed, so we must add it manually. This is done via a new G_TYPE instruction.
  llvm-svn: 280292
* [TargetPassConfig] Add a hook to tell whether GlobalISel should warn on fallback. (Quentin Colombet, 2016-08-31; 2 files, -5/+12)
  Thanks to this patch, we now have a way to easily see if GlobalISel failed.
  llvm-svn: 280273
* [ResetMachineFunction] Emit the diagnostic isel fallback when asked. (Quentin Colombet, 2016-08-31; 1 file, -5/+14)
  This pass is now able to report when the function is being reset.
  llvm-svn: 280272
* Add an optional parameter with a list of undefs to extendToIndices (Krzysztof Parzyszek, 2016-08-31; 1 file, -2/+3)
  llvm-svn: 280268
* s/static inline/static/ for headers I have changed in r279475. NFC. (Tim Shen, 2016-08-31; 1 file, -1/+1)
  llvm-svn: 280257
* [codeview] Emit vtable shape information (Reid Kleckner, 2016-08-31; 2 files, -2/+24)
  The shape of the vtable is passed down as the size of the __vtbl_ptr_type. This special pointer type appears both as the pointee type of the vptr type, and by itself in every dynamic class. For classes with multiple vtables, only the shape of the primary vftable is included, as the shape of all secondary vftables will be the same as in the base class.
  Fixes PR28150
  llvm-svn: 280254
* [statepoints][experimental] Add support for live-in semantics of values in deopt bundles (Philip Reames, 2016-08-31; 2 files, -7/+44)
  This is a first step towards supporting deopt value lowering and reporting entirely with the register allocator. I hope to build on this in the near future to support live-on-return semantics, but I have a use case which allows me to test and investigate code quality with just the live-in semantics, so I've chosen to start there. For those curious, my use case is our implementation of the "__llvm_deoptimize" function we bind to @llvm.deoptimize. I'm choosing not to hard code that fact in the patch and instead make it configurable via function attributes.
  The basic approach here is modelled on what is done for the "Live In" values on stackmaps and patchpoints. (A secondary goal here is to remove one of the last barriers to merging the pseudo instructions.) We start by adding the operands directly to the STATEPOINT SDNode. Once we've lowered to MI, we extend the remat logic used by the register allocator to fold virtual register uses into StackMap::Indirect entries as needed. This does rely on the fact that the register allocator rematerializes. If it didn't do so along some code path, we could end up with more vregs than physical registers and fail to allocate.
  Today, we *only* fold in the register allocator. This can create some weird effects when combined with arguments passed on the stack because we don't fold them appropriately. I have an idea how to fix that, but it needs this patch in place to work on that effectively. (There's some weird interaction with the scheduler as well; more investigation is needed.)
  My near term plan is to land this patch off-by-default, experiment in my local tree to identify any correctness issues, and then start fixing codegen problems one by one as I find them. Once I have the live-in lowering fully working (both correctness and code quality), I'm hoping to move on to the live-on-return semantics.
  Note: I don't have any *known* miscompiles with this patch enabled, but I'm pretty sure I'll find at least a couple. Thus the "experimental" tag and the fact that it's off by default.
  Differential Revision: https://reviews.llvm.org/D24000
  llvm-svn: 280250
* Fixed spill stack objects are mutable (Krzysztof Parzyszek, 2016-08-31; 1 file, -3/+3)
  Differential Revision: https://reviews.llvm.org/D24039
  llvm-svn: 280244
* [XRay] Support multiple return instructions in a single basic block (Dean Michael Berris, 2016-08-31; 1 file, -1/+0)
  Add a .mir test to catch this case, and fix the xray-instrumentation pass to handle it appropriately.
  llvm-svn: 280192
* [codeview] Remove redundant TypeTable lookup (Reid Kleckner, 2016-08-30; 1 file, -17/+1)
  As written, the code should assert if this lookup would have ever succeeded. Without looking through composite types, the type graph should be acyclic.
  llvm-svn: 280168
* GlobalISel: combine extracts & sequences created for legalization (Tim Northover, 2016-08-30; 2 files, -0/+83)
  Legalization ends up creating many G_SEQUENCE/G_EXTRACT pairs which lead to inefficient codegen (even for -O0), so add a quick pass over the function to remove them again.
  llvm-svn: 280155
* CodeGen: Fixup for r280128, since GCC isn't as permissive as Clang (Duncan P. N. Exon Smith, 2016-08-30; 1 file, -5/+3)
  Fixes the bots, e.g.: http://lab.llvm.org:8011/builders/lldb-x86_64-ubuntu-14.04-buildserver/builds/10055
  llvm-svn: 280135
* GlobalISel: forbid physical registers on generic MIs. (Tim Northover, 2016-08-30; 1 file, -0/+8)
  We're intending to move to a world where the type of a register is determined by its (unique) def. This is incompatible with physregs, which are untyped. It also means the other passes don't have to worry quite so much about register-class compatibility and inserting COPYs appropriately.
  llvm-svn: 280132
* ADT: Split ilist_node_traits into alloc and callback, NFC (Duncan P. N. Exon Smith, 2016-08-30; 2 files, -8/+11)
  Many lists want to override only allocation semantics, or callbacks for iplist. Split these up to prevent code duplication.
    - Specialize ilist_alloc_traits to change the implementations of deleteNode() and createNode().
    - One common desire is to do nothing in deleteNode() and disable createNode(). Specialize ilist_alloc_traits to inherit from ilist_noalloc_traits for that behaviour.
    - Specialize ilist_callback_traits to use the addNodeToList(), removeNodeFromList(), and transferNodesFromList() callbacks.
  As a drive-by, add some coverage to the callback-related unit tests.
  llvm-svn: 280128
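  A hedged sketch of the two specializations described above, on a made-up node type (the trait names follow the commit text; the details are assumptions, not in-tree code):

    // Sketch: customize allocation and list callbacks independently for an
    // intrusive list of a made-up node type "MyNode".
    #include "llvm/ADT/ilist.h"
    #include "llvm/ADT/ilist_node.h"

    struct MyNode : llvm::ilist_node<MyNode> { int Value = 0; };

    namespace llvm {
    // Allocation side only: never create or delete nodes from the list.
    template <>
    struct ilist_alloc_traits<MyNode> : ilist_noalloc_traits<MyNode> {};

    // Callback side only: observe insertions, removals, and transfers.
    template <> struct ilist_callback_traits<MyNode> {
      void addNodeToList(MyNode *) {}
      void removeNodeFromList(MyNode *) {}
      template <class Iterator>
      void transferNodesFromList(ilist_callback_traits &, Iterator, Iterator) {}
    };
    } // namespace llvm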
* TailDuplication: Extract Indirect-Branch block limit as option. NFC (Kyle Butt, 2016-08-30; 1 file, -3/+9)
  The existing code hard-coded a limit of 20 instructions for duplication when a block ended with an indirect branch. Extract this as an option. No functional change intended.
  llvm-svn: 280125
* ADT: Guarantee transferNodesFromList is only called on transfers (Duncan P. N. Exon Smith, 2016-08-30; 1 file, -3/+2)
  Guarantee that ilist_traits<T>::transferNodesFromList is only called when nodes are actually changing lists. I also moved all the callbacks to occur *first*, before the operation. This is the only choice for iplist<T>::merge, so we might as well be consistent. I expect this to have no effect in practice, although it simplifies the logic in both iplist<T>::transfer and iplist<T>::insert.
  llvm-svn: 280122
* Fix coding style; NFC (Sanjoy Das, 2016-08-30; 1 file, -4/+2)
  Avoid variables starting with lowercase.
  llvm-svn: 280048
* ADT: Give ilist<T>::reverse_iterator a handle to the current nodeDuncan P. N. Exon Smith2016-08-301-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reverse iterators to doubly-linked lists can be simpler (and cheaper) than std::reverse_iterator. Make it so. In particular, change ilist<T>::reverse_iterator so that it is *never* invalidated unless the node it references is deleted. This matches the guarantees of ilist<T>::iterator. (Note: MachineBasicBlock::iterator is *not* an ilist iterator, but a MachineInstrBundleIterator<MachineInstr>. This commit does not change MachineBasicBlock::reverse_iterator, but it does update MachineBasicBlock::reverse_instr_iterator. See note at end of commit message for details on bundle iterators.) Given the list (with the Sentinel showing twice for simplicity): [Sentinel] <-> A <-> B <-> [Sentinel] the following is now true: 1. begin() represents A. 2. begin() holds the pointer for A. 3. end() represents [Sentinel]. 4. end() holds the poitner for [Sentinel]. 5. rbegin() represents B. 6. rbegin() holds the pointer for B. 7. rend() represents [Sentinel]. 8. rend() holds the pointer for [Sentinel]. The changes are #6 and #8. Here are some properties from the old scheme (which used std::reverse_iterator): - rbegin() held the pointer for [Sentinel] and rend() held the pointer for A; - operator*() cost two dereferences instead of one; - converting from a valid iterator to its valid reverse_iterator involved a confusing increment; and - "RI++->erase()" left RI invalid. The unintuitive replacement was "RI->erase(), RE = end()". With vector-like data structures these properties are hard to avoid (since past-the-beginning is not a valid pointer), and don't impose a real cost (since there's still only one dereference, and all iterators are invalidated on erase). But with lists, this was a poor design. Specifically, the following code (which obviously works with normal iterators) now works with ilist::reverse_iterator as well: for (auto RI = L.rbegin(), RE = L.rend(); RI != RE;) fooThatMightRemoveArgFromList(*RI++); Converting between iterator and reverse_iterator for the same node uses the getReverse() function. reverse_iterator iterator::getReverse(); iterator reverse_iterator::getReverse(); Why doesn't iterator <=> reverse_iterator conversion use constructors? In order to catch and update old code, reverse_iterator does not even have an explicit conversion from iterator. It wouldn't be safe because there would be no reasonable way to catch all the bugs from the changed semantic (see the changes at call sites that are part of this patch). 
Old code used this API: std::reverse_iterator::reverse_iterator(iterator); iterator std::reverse_iterator::base(); Here's how to update from old code to new (that incorporates the semantic change), assuming I is an ilist<>::iterator and RI is an ilist<>::reverse_iterator: [Old] ==> [New] reverse_iterator(I) (--I).getReverse() reverse_iterator(I) ++I.getReverse() --reverse_iterator(I) I.getReverse() reverse_iterator(++I) I.getReverse() RI.base() (--RI).getReverse() RI.base() ++RI.getReverse() --RI.base() RI.getReverse() (++RI).base() RI.getReverse() delete &*RI, RE = end() delete &*RI++ RI->erase(), RE = end() RI++->erase() ======================================= Note: bundle iterators are out of scope ======================================= MachineBasicBlock::iterator, also known as MachineInstrBundleIterator<MachineInstr>, is a wrapper to represent MachineInstr bundles. The idea is that each operator++ takes you to the beginning of the next bundle. Implementing a sane reverse iterator for this is harder than ilist. Here are the options: - Use std::reverse_iterator<MBB::i>. Store a handle to the beginning of the next bundle. A call to operator*() runs a loop (usually operator--() will be called 1 time, for unbundled instructions). Increment/decrement just works. This is the status quo. - Store a handle to the final node in the bundle. A call to operator*() still runs a loop, but it iterates one time fewer (usually operator--() will be called 0 times, for unbundled instructions). Increment/decrement just works. - Make the ilist_sentinel<MachineInstr> *always* store that it's the sentinel (instead of just in asserts mode). Then the bundle iterator can sniff the sentinel bit in operator++(). I initially tried implementing the end() option as part of this commit, but updating iterator/reverse_iterator conversion call sites was error-prone. I have a WIP series of patches that implements the final option. llvm-svn: 280032
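  A small hedged sketch of the idiom this enables, on an ilist of a made-up node type (illustrative only):

    // Sketch of the reverse-iteration-with-erasure idiom described above.
    // reverse_iterator now tracks the current node, so erasing the element
    // we just stepped past does not invalidate RI.
    #include "llvm/ADT/ilist.h"
    #include "llvm/ADT/ilist_node.h"

    struct Item : llvm::ilist_node<Item> { int Value = 0; };

    static void eraseNegativeInReverse(llvm::ilist<Item> &L) {
      for (auto RI = L.rbegin(), RE = L.rend(); RI != RE;) {
        Item &I = *RI++;   // step first; I can now be erased safely
        if (I.Value < 0)
          L.erase(&I);     // default alloc traits delete the node
      }
      // Converting between directions uses getReverse(), e.g.:
      //   auto FwdIt = L.rbegin().getReverse();
    }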
* GlobalISel: use multi-dimensional arrays for legalize actions. (Tim Northover, 2016-08-29; 1 file, -12/+16)
  Instead of putting all possible requests into a single table, we can perform the extremely dense lookup based on opcode and type-index in constant time using multi-dimensional array-like things. This roughly halves the time spent doing legalization, which was dominated by queries against the Actions table.
  llvm-svn: 280011
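  A generic, hedged illustration of the dense-lookup idea (all names here are made up; this is not the in-tree legalizer code):

    // Illustration only: index a nested array by opcode and type index
    // instead of hashing a composite key, giving a constant-time lookup.
    #include <cstdint>
    #include <vector>

    enum class ExampleAction : uint8_t { Legal, WidenScalar, NarrowScalar, Libcall, Custom };

    struct ExampleActionTable {
      unsigned FirstOpcode = 0;
      // Actions[Opcode - FirstOpcode][TypeIdx]
      std::vector<std::vector<ExampleAction>> Actions;

      ExampleAction lookup(unsigned Opcode, unsigned TypeIdx) const {
        return Actions[Opcode - FirstOpcode][TypeIdx];
      }
    };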
* Propagate TBAA info in SelectionDAG::getIndexedLoad (Krzysztof Parzyszek, 2016-08-29; 1 file, -1/+2)
  Patch by Pranav Bhandarkar.
  llvm-svn: 279998
* GlobalISel: switch to SmallVector for pending legalizations. (Tim Northover, 2016-08-29; 1 file, -6/+8)
  std::queue was doing far too many heap allocations to be healthy.
  llvm-svn: 279992
* GlobalISel: legalize frem to a libcall on AArch64. (Tim Northover, 2016-08-29; 2 files, -0/+29)
  llvm-svn: 279988
* GlobalISel: rework CallLowering so that it can be used for libcalls too. (Tim Northover, 2016-08-29; 3 files, -3/+44)
  There should be no functional change here, I'm just making the implementation of "frem" (to libcall) legalization easier for a followup.
  llvm-svn: 279987
* IfConversion: Fix branch predication bug. (Kyle Butt, 2016-08-29; 1 file, -20/+61)
  This bug shows up with diamonds that share unpredicable, unanalyzable branches. There's an included test case from Hexagon. What was happening was that we were attempting to predicate the branch instruction even though it had been checked to be the same. Now, for unanalyzable branches, we skip over the branch instructions when predicating the block.
  Differential Revision: https://reviews.llvm.org/D23939
  llvm-svn: 279985
* [TargetLowering] remove fdiv and frem from canOpTrap() (PR29114) (Sanjay Patel, 2016-08-29; 1 file, -4/+0)
  Assuming the default FP env, we should not treat fdiv and frem any differently in terms of trapping behavior than any other FP op. I.e., FP ops do not trap with the default FP env. This matches how we treat these ops in IR with isSafeToSpeculativelyExecute(). There's a similar bug in Constant::canTrap().
  This bug manifests in PR29114: https://llvm.org/bugs/show_bug.cgi?id=29114 ...as a sequence of scalar divisions instead of a vector division on x86 for a <3 x float> type.
  Differential Revision: https://reviews.llvm.org/D23974
  llvm-svn: 279970
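  A conceptual, hedged sketch of the post-change behaviour (not the verbatim TargetLoweringBase code): only integer division and remainder remain reported as potentially trapping.

    // Conceptual sketch of canOpTrap() after this change; names and the
    // IsInteger parameter are simplifications for illustration.
    #include "llvm/CodeGen/ISDOpcodes.h"

    static bool canOpTrapSketch(unsigned Op, bool IsInteger) {
      if (!IsInteger)
        return false; // with the default FP env, FP ops (incl. fdiv/frem) don't trap
      switch (Op) {
      case llvm::ISD::SDIV:
      case llvm::ISD::UDIV:
      case llvm::ISD::SREM:
      case llvm::ISD::UREM:
        return true;  // integer division by zero can trap
      default:
        return false;
      }
    }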
* Do not use MRI::getMaxLaneMaskForVReg as a mask covering whole register (Krzysztof Parzyszek, 2016-08-29; 2 files, -7/+5)
  MRI::getMaxLaneMaskForVReg does not always cover the whole register. For example, on X86 the upper 16 bits of EAX cannot be accessed via any subregister. Consequently, there is no lane mask that only covers that part of EAX. The getMaxLaneMaskForVReg will return the union of the lane masks for all subregisters, and in case of EAX, that union will not cover the upper 16 bits.
  This fixes https://llvm.org/bugs/show_bug.cgi?id=29132
  llvm-svn: 279969
* Use the correct ctor/dtor section for dynamic-no-pic. (Rafael Espindola, 2016-08-29; 1 file, -1/+1)
  llvm-svn: 279967
* Move code only used by codegen out of MC. NFC. (Rafael Espindola, 2016-08-29; 1 file, -5/+55)
  MC itself never needs to know about these sections.
  llvm-svn: 279965
* Fixed a bug in type legalizer for masked gather. (Igor Breger, 2016-08-29; 1 file, -1/+9)
  The problem occurs when the node isn't updated in place and UpdateNodeOperation() returns a node that already exists. In that case the assert in PromoteIntegerOperand() fails, since N has 2 results (val + chain).
  Differential Revision: http://reviews.llvm.org/D23756
  llvm-svn: 279961
* [InstructionSelect] NumBlocks isn't defined in DEBUG build. (Haojian Wu, 2016-08-29; 1 file, -1/+1)
  Summary: A follow-up fix for http://llvm.org/viewvc/llvm-project?view=revision&revision=279905.
  Reviewers: bkramer
  Subscribers: cfe-commits
  Differential Revision: https://reviews.llvm.org/D23985
  llvm-svn: 279959
* [RegBankSelect] Do not abort when the target wants to fall back. (Quentin Colombet, 2016-08-27; 1 file, -17/+48)
  llvm-svn: 279906
* [InstructionSelect] Do not abort when the target wants to fall back. (Quentin Colombet, 2016-08-27; 1 file, -7/+28)
  llvm-svn: 279905
* [MachineLegalize] Do not abort when the target wants to fall back. (Quentin Colombet, 2016-08-27; 2 files, -6/+26)
  llvm-svn: 279904
* [GlobalISel] Add a fallback path to SDISel. (Quentin Colombet, 2016-08-27; 3 files, -0/+63)
  When global-isel fails on a MachineFunction MF, MF will be cleaned up and given to SDISel. Thanks to this fallback, we can already perform correctness testing even if we support only a small portion of the functions in a test.
  llvm-svn: 279891
* [GlobalISel] Teach the core pipeline not to run if ISel failed. (Quentin Colombet, 2016-08-27; 3 files, -0/+14)
  llvm-svn: 279889
* [IRTranslator] Do not abort when the target wants to fall back. (Quentin Colombet, 2016-08-26; 1 file, -5/+52)
  Every pass in the GlobalISel pipeline will need to do something similar.
  llvm-svn: 279886
* [MFProperties] Introduce a FailedISel property. (Quentin Colombet, 2016-08-26; 1 file, -0/+1)
  This is used to communicate that the instruction selection pipeline failed at some point. Another way to achieve that would be to have some kind of conditional scheduling in the PassManager, such that we only schedule a pass based on the success/failure of another one. The property approach has the advantage of being lightweight and solving the problem at hand.
  llvm-svn: 279885
* [TargetPassConfig] Add a target hook to know what GlobalISel should do on error. (Quentin Colombet, 2016-08-26; 1 file, -0/+13)
  By default, this hook tells GlobalISel to abort (report a fatal error) when it encounters an error. The alternative will be to fall back on SDISel. This fallback will be removed when the bring-up of GlobalISel is over.
  llvm-svn: 279879
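  In rough terms the hook is a boolean query on the pass configuration; a sketch follows, with the class and method names being assumptions for illustration rather than the in-tree declaration:

    // Sketch only: a pass-config query telling GlobalISel whether to abort
    // on error or fall back on SelectionDAG.
    struct ExamplePassConfigHook {
      virtual ~ExamplePassConfigHook() = default;
      // Default: report a fatal error when GlobalISel cannot handle something.
      virtual bool isGlobalISelAbortEnabled() const { return true; }
    };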
* [IRTranslator][NFC] Use DEBUG_TYPE instead of repeating the name. (Quentin Colombet, 2016-08-26; 1 file, -1/+1)
  llvm-svn: 279878
* [SelectionDAG] Do not run the ISel process on already selected code. (Quentin Colombet, 2016-08-26; 1 file, -0/+4)
  Right now, this cannot happen, but with the fallback path of GlobalISel it will show up eventually.
  llvm-svn: 279877
* [MachineFunction] Introduce a reset method. (Quentin Colombet, 2016-08-26; 1 file, -5/+14)
  This method allows resetting the state of a MachineFunction as if it was just created. This will be used during the bring-up of GlobalISel to provide a way to fall back on SelectionDAG. That way, we can start doing correctness testing even if we are not able to select all functions via the global instruction selector.
  llvm-svn: 279876
* [MFProperties] Introduce a reset method with no argument. (Quentin Colombet, 2016-08-26; 1 file, -1/+1)
  This method allows resetting all the properties in one go.
  llvm-svn: 279874