path: root/llvm/lib/CodeGen
Commit message | Author | Age | Files | Lines
* [X86][SSE] Reapplied: Simplify vector LOAD + EXTEND on pre-SSE41 hardware | Simon Pilgrim | 2016-03-22 | 2 | -0/+102
  Improve vector extension of vectors on hardware without dedicated VSEXT/VZEXT instructions. We already convert these to SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG but can further improve this by using the legalizer instead of prematurely splitting into legal vectors in the combine, as this only properly helps for lowering to VSEXT/VZEXT. Removes a lot of unnecessary any_extend + mask patterns (fix for PR25718).
  Reapplied with a fix for PR26953 (missing vector widening legalization).
  Differential Revision: http://reviews.llvm.org/D17932
  llvm-svn: 264062
* Add "first class" lowering for deopt operand bundles | Sanjoy Das | 2016-03-22 | 4 | -24/+99
  Summary: After this change, deopt operand bundles can be lowered directly by SelectionDAG into STATEPOINT instructions (which are then lowered to a call or sequence of nop, with an associated __llvm_stackmaps entry). This obviates the need to round-trip deoptimization state through gc.statepoint via RewriteStatepointsForGC.
  Reviewers: reames, atrick, majnemer, JosephTremoulet, pgavlin
  Subscribers: sanjoy, mcrosier, majnemer, llvm-commits
  Differential Revision: http://reviews.llvm.org/D18257
  llvm-svn: 264015
* [DAGCombine] Catch the case where extract_vector_elt can cause an any_ext while processing AND SDNodes | Silviu Baranga | 2016-03-21 | 1 | -0/+1
  Summary: extract_vector_elt can cause an implicit any_ext if the types don't match. When processing the following pattern:
    (and (extract_vector_elt (load ([non_ext|any_ext|zero_ext] V))), c)
  DAGCombine was ignoring the possible extend, and sometimes removing the AND even though it was required to keep some of the bits in the result at 0, resulting in a miscompile. This change fixes the issue by limiting the transformation to cases where the extract_vector_elt doesn't perform the implicit extend.
  Reviewers: t.p.northover, jmolloy
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D18247
  llvm-svn: 263935
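The hazard described in this message can be modelled outside LLVM. The following is only a standalone sketch, not DAGCombine code: the helper names `extractAnyExt` and `extractAndMask` and the garbage bit pattern are hypothetical, chosen to make the unspecified upper bits of an implicit any_ext visible.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy model of extract_vector_elt on a vector of 8-bit elements whose result
// type is 32 bits wide. An implicit any_ext leaves the upper 24 bits
// unspecified; a fixed garbage pattern stands in for them here (assumption).
inline std::uint32_t extractAnyExt(const std::array<std::uint8_t, 4> &V,
                                   std::size_t Idx) {
  assert(Idx < V.size());
  const std::uint32_t Garbage = 0xABCDEF00u; // placeholder for "any" bits
  return Garbage | V[Idx];
}

// The pattern (and (extract_vector_elt ...), 0xFF): the mask is what
// guarantees that the upper bits of the result are zero.
inline std::uint32_t extractAndMask(const std::array<std::uint8_t, 4> &V,
                                    std::size_t Idx) {
  return extractAnyExt(V, Idx) & 0xFFu;
}
```

In this model, dropping the mask changes the observable value whenever the extract widens the element, which is the situation the fix excludes from the transformation.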
* Suppress a -Wunused-variable warning in release builds. | Craig Topper | 2016-03-20 | 1 | -0/+1
  llvm-svn: 263892
* CodeGen: use range based for loop | Saleem Abdulrasool | 2016-03-19 | 1 | -5/+4
  Convert a loop to a range-based for loop. NFC.
  llvm-svn: 263884
* [CXX_FAST_TLS] fix issues with O0 on ARM, AArch64 and X86. | Manman Ren | 2016-03-18 | 1 | -1/+1
  Since at O0, explicit copies via SplitCSR may not be removed even if they are unnecessary, we choose not to use SplitCSR at O0.
  llvm-svn: 263855
* MILexer: Add ErrorCallbackType typedef; NFC | Matthias Braun | 2016-03-18 | 1 | -30/+22
  llvm-svn: 263829
* [codeview] Only emit function ids for inlined functions | Reid Kleckner | 2016-03-18 | 2 | -54/+76
  We aren't referencing any other kind of function currently. Should save a bit on our debug info size.
  llvm-svn: 263817
* DebugInfo: Add ability to not emit DW_AT_vtable_elem_location for virtual functions. | Peter Collingbourne | 2016-03-17 | 1 | -4/+6
  A virtual index of -1u indicates that the subprogram's virtual index is unrepresentable (for example, when using the relative vtable ABI), so do not emit a DW_AT_vtable_elem_location attribute for it.
  Differential Revision: http://reviews.llvm.org/D18236
  llvm-svn: 263765
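The sentinel check this message describes might look roughly like the sketch below. It is not the actual DwarfUnit code: the constant `UnrepresentableVirtualIndex` and the helper `shouldEmitVtableElemLocation` are hypothetical names used only for illustration.

```cpp
#include <cstdint>

// Sentinel described in the commit message: a virtual index of -1u means the
// index cannot be represented (for example under the relative vtable ABI).
constexpr std::uint32_t UnrepresentableVirtualIndex = 0xFFFFFFFFu;

// Hypothetical helper: decide whether DW_AT_vtable_elem_location should be
// emitted for a subprogram with the given virtual index.
constexpr bool shouldEmitVtableElemLocation(std::uint32_t VirtualIndex) {
  return VirtualIndex != UnrepresentableVirtualIndex;
}
```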
* [SelectionDAG] Remove visitStatepoint; NFC | Sanjoy Das | 2016-03-17 | 3 | -11/+2
  This way we have a single entry point into StatepointLowering. The method was a direct dispatch to LowerStatepoint anyway.
  llvm-svn: 263682
* Fix indentation; NFC | Sanjoy Das | 2016-03-16 | 1 | -3/+2
  llvm-svn: 263672
* Extract out a SelectionDAGBuilder::LowerAsStatepoint; NFC | Sanjoy Das | 2016-03-16 | 2 | -144/+197
  Summary: This is a step towards implementing "direct" lowering of calls and invokes with deopt operand bundles into STATEPOINT nodes (as opposed to having them mandatorily pass through RewriteStatepointsForGC, which is the case today).
  This change extracts out a `SelectionDAGBuilder::LowerAsStatepoint` helper function that is able to lower a "statepoint like thing", and uses it to lower `gc.statepoint` calls. This is an NFC now, but in a later change we will use `LowerAsStatepoint` to directly lower calls and invokes with operand bundles without going through an intermediate `gc.statepoint` IR representation.
  FYI: I expect `SelectionDAGBuilder::StatepointInfo` will evolve as I add support for lowering non gc.statepoints; right now it is fairly tightly coupled with an IR level `gc.statepoint`.
  Reviewers: reames, pgavlin, JosephTremoulet
  Subscribers: sanjoy, mcrosier, llvm-commits
  Differential Revision: http://reviews.llvm.org/D18106
  llvm-svn: 263671
* Tweak some atomics functions in preparation for larger changes; NFC. | James Y Knight | 2016-03-16 | 4 | -16/+18
  - Rename getATOMIC to getSYNC, as llvm will soon be able to emit both '__sync' libcalls and '__atomic' libcalls, and this function is for the '__sync' ones.
  - getInsertFencesForAtomic() has been replaced with shouldInsertFencesForAtomic(Instruction), so that the decision can be made per-instruction. This functionality will be used soon.
  - emitLeadingFence/emitTrailingFence are no longer called if shouldInsertFencesForAtomic returns false, and thus don't need to check the condition themselves.
  llvm-svn: 263665
* [SelectionDAG] Extract out populateCallLoweringInfo; NFC | Sanjoy Das | 2016-03-16 | 3 | -30/+31
  SelectionDAGBuilder::populateCallLoweringInfo is now used instead of SelectionDAGBuilder::lowerCallOperands. The populateCallLoweringInfo interface is more composable in the face of design changes like http://reviews.llvm.org/D18106
  llvm-svn: 263663
* Removed trailing whitespace | Simon Pilgrim | 2016-03-16 | 1 | -12/+12
  llvm-svn: 263650
* [MachO] Add MachO alt-entry directive support. | Lang Hames | 2016-03-15 | 1 | -1/+6
  This patch adds support for the MachO .alt_entry assembly directive, and uses it for global aliases with non-zero GEP offsets. The alt_entry flag indicates that a symbol should be laid out immediately after the preceding symbol. Conceptually it introduces an alternate entry point for a function or data structure. E.g.:
    safe_foo:
      // check preconditions for foo
    .alt_entry fast_foo
    fast_foo:
      // body of foo, can assume preconditions.
  The .alt_entry flag is also implicitly set on assembly aliases of the form:
    a = b + C
  where C is a non-zero constant, since these have the same effect as an alt_entry symbol: they introduce a label that cannot be moved relative to the preceding one. Setting the alt_entry flag on aliases of this form fixes http://llvm.org/PR25381.
  llvm-svn: 263521
* [StatepointLowering] Move an assertion; NFCI | Sanjoy Das | 2016-03-15 | 1 | -6/+4
  Instead of running an explicit loop over `gc.relocate` calls hanging off of a `gc.statepoint`, assert the validity of the type of the value being relocated in `visitRelocate`.
  llvm-svn: 263516
* Temporarily Revert "[X86][SSE] Simplify vector LOAD + EXTEND on | Eric Christopher | 2016-03-14 | 2 | -40/+0
  pre-SSE41 hardware" as it seems to be causing crashes during code generation in halide. PR forthcoming.
  This reverts commit r263303.
  llvm-svn: 263512
* Factor out MachineBlockPlacement::fillWorkLists. NFC | Amaury Sechet | 2016-03-14 | 1 | -36/+39
  Summary: There are places in MachineBlockPlacement where a worklist is filled in a pretty much identical way, so the code is duplicated. This refactors it so that the same code is used in both scenarios.
  Reviewers: chandlerc, majnemer, rafael, MatzeB, escha, silvas
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D18077
  llvm-svn: 263495
* [SpillPlacement] Fix a quadratic behavior in spill placement. | Quentin Colombet | 2016-03-14 | 2 | -53/+44
  The bad behavior happens when we have a function with a long linear chain of basic blocks, and have a live range spanning most of this chain, but with very few uses. Let's say we have only 2 uses.
  The Hopfield network is only seeded with two active blocks where the uses are, and each iteration of the outer loop in `RAGreedy::growRegion()` only adds two new nodes to the network due to the completely linear shape of the CFG. Meanwhile, `SpillPlacer->iterate()` visits the whole set of discovered nodes, which adds up to a quadratic algorithm. This is a historical accident from r129188.
  When the Hopfield network is expanding, most of the action is happening on the frontier where new nodes are being added. The internal nodes in the network are not likely to be flip-flopping much, or they will at least settle down very quickly. This means that while `SpillPlacer->iterate()` is recomputing all the nodes in the network, it is probably only the two frontier nodes that are changing their output.
  Instead of recomputing the whole network on each iteration, we can maintain a SparseSet of nodes that need to be updated:
  - `SpillPlacement::activate()` adds the node to the todo list.
  - When a node changes value (i.e., `update()` returns true), its neighbors are added to the todo list.
  - `SpillPlacement::iterate()` only updates the nodes in the list.
  The result of Hopfield iterations is not necessarily exact. It should converge to a local minimum, but there is no guarantee that it will find a global minimum. It is possible that updating nodes in a different order will cause us to switch to a different local minimum. In other words, this is not NFC, but although I saw a few runtime improvements and regressions when I benchmarked this change, those were side effects and actually the performance change is in the noise as expected.
  Huge thanks to Jakob Stoklund Olesen <stoklund@2pi.dk> for his feedback, guidance and time for the review.
  llvm-svn: 263460
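The todo-list scheme from the three bullet points in this message can be sketched on a toy network. This is a minimal illustration under stated assumptions, not the real SpillPlacement implementation: `ToyNetwork`, its integer bias/value model, and the vector-plus-flags stand-in for SparseSet are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy Hopfield-style network with a todo list of nodes to recompute.
struct ToyNetwork {
  struct Node {
    int Bias = 0;                   // external preference of the node
    int Value = 0;                  // current output: -1, 0 or +1
    std::vector<std::size_t> Links; // indices of neighboring nodes
  };

  std::vector<Node> Nodes;
  std::vector<std::size_t> Todo; // nodes that need to be updated
  std::vector<bool> InTodo;      // membership flags (stand-in for a SparseSet)

  explicit ToyNetwork(std::size_t N) : Nodes(N), InTodo(N, false) {}

  void link(std::size_t A, std::size_t B) {
    assert(A < Nodes.size() && B < Nodes.size());
    Nodes[A].Links.push_back(B);
    Nodes[B].Links.push_back(A);
  }

  // Queue a node for recomputation, avoiding duplicate entries.
  void activate(std::size_t N) {
    assert(N < Nodes.size());
    if (!InTodo[N]) {
      InTodo[N] = true;
      Todo.push_back(N);
    }
  }

  // Recompute one node from its bias and its neighbors' outputs.
  // Returns true if the node's output changed.
  bool update(std::size_t N) {
    int Sum = Nodes[N].Bias;
    for (std::size_t L : Nodes[N].Links)
      Sum += Nodes[L].Value;
    const int New = (Sum > 0) - (Sum < 0); // sign of Sum
    const bool Changed = New != Nodes[N].Value;
    Nodes[N].Value = New;
    return Changed;
  }

  // Update only the queued nodes; a node that changes re-queues its
  // neighbors. Returns the number of update() calls performed.
  std::size_t iterate() {
    std::size_t Updates = 0;
    while (!Todo.empty()) {
      const std::size_t N = Todo.back();
      Todo.pop_back();
      InTodo[N] = false;
      ++Updates;
      if (update(N))
        for (std::size_t L : Nodes[N].Links)
          activate(L);
    }
    return Updates;
  }
};
```

Once the toy network has settled, activating a single node costs one `update()` call instead of a pass over every discovered node, which is the effect the change aims for on long linear chains.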
* [DAG] use !isUndef() ; NFCI | Sanjay Patel | 2016-03-14 | 4 | -13/+11
  llvm-svn: 263453
* [DAG] use isUndef() ; NFCI | Sanjay Patel | 2016-03-14 | 5 | -98/+94
  llvm-svn: 263448
* Revert "Recommitted r261633 "Supporting all entities declared in lexical scope in LLVM debug info." After fixing PR26715 at r263379." | Benjamin Kramer | 2016-03-14 | 6 | -199/+51
  This reverts commit r263424. Breaks self-host.
  llvm-svn: 263437
* Recommitted r261633 "Supporting all entities declared in lexical scope in LLVM debug info." After fixing PR26715 at r263379. | Amjad Aboud | 2016-03-14 | 6 | -51/+199
  llvm-svn: 263424
* [CodeView] Consistently handle overly large symbol names | David Majnemer | 2016-03-14 | 1 | -15/+17
  Overly large symbol names weren't correctly handled for leaf function records.
  llvm-svn: 263408
* [CodeView] Truncate display names | David Majnemer | 2016-03-13 | 1 | -5/+8
  Fundamentally, the length of a variable or function name is bounded by the maximum size of a record: 0xffff. However, the name doesn't live in a vacuum; other data is associated with the name, lowering the bound further.
  We would naively attempt to emit the name, causing us to assert because the record would no longer fit in 16 bits. Instead, truncate the name but preserve as much as we can.
  While I have tested this locally, I've decided not to commit the test due to its size.
  N.B. While this behavior is undesirable, it is better than MSVC's behavior. They seem to truncate to ~4000 characters.
  llvm-svn: 263378
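The bound described in this message amounts to a small length computation. The sketch below is an assumption-laden illustration rather than the CodeView emitter itself: `truncateToFit`, its `OtherBytes` parameter, and the one-byte terminator accounting are hypothetical; only the 0xffff record limit comes from the message.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Maximum record size stated in the commit message.
constexpr std::size_t MaxRecordLength = 0xFFFF;

// Returns the longest prefix of Name that still fits in a record which also
// carries OtherBytes of associated data. One extra byte is reserved for a
// null terminator (an assumption made for this sketch).
inline std::string truncateToFit(const std::string &Name,
                                 std::size_t OtherBytes) {
  if (OtherBytes + 1 >= MaxRecordLength)
    return std::string(); // no room left for any part of the name
  const std::size_t MaxNameLength = MaxRecordLength - OtherBytes - 1;
  // substr clamps the count to the length of Name, so short names are
  // returned unchanged.
  std::string Result = Name.substr(0, MaxNameLength);
  assert(Result.size() + OtherBytes + 1 <= MaxRecordLength);
  return Result;
}
```

For instance, with 100 bytes of other data a 70000-character name would be cut to 0xFFFF - 100 - 1 characters, while a short name passes through untouched.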
* Make gc relocates more strongly typed; NFC | Sanjoy Das | 2016-03-12 | 1 | -10/+13
  Don't use a `Value *` where we can use a stronger `GCRelocateInst *` type.
  llvm-svn: 263327
* [X86][SSE] Simplify vector LOAD + EXTEND on pre-SSE41 hardware | Simon Pilgrim | 2016-03-11 | 2 | -0/+40
  Improve vector extension of vectors on hardware without dedicated VSEXT/VZEXT instructions. We already convert these to SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG but can further improve this by using the legalizer instead of prematurely splitting into legal vectors in the combine, as this only properly helps for lowering to VSEXT/VZEXT. Removes a lot of unnecessary any_extend + mask patterns (fix for PR25718).
  Differential Revision: http://reviews.llvm.org/D17932
  llvm-svn: 263303
* [IRTranslator] Translate unconditional branches. | Quentin Colombet | 2016-03-11 | 2 | -0/+26
  llvm-svn: 263265
* [MachineIRBuilder] Rework buildInstr API to maximize code reuse. | Quentin Colombet | 2016-03-11 | 1 | -22/+24
  llvm-svn: 263264
* [IRTranslator] Update getOrCreateVReg API to use references. | Quentin Colombet | 2016-03-11 | 1 | -10/+10
  A value that we want to keep in a virtual register cannot be null. Reflect that in the API.
  llvm-svn: 263263
* [MachineIRBuilder] Rename the setter of MF for consistency with the getter. | Quentin Colombet | 2016-03-11 | 2 | -2/+2
  llvm-svn: 263262
* [MachineIRBuilder] Rename the setter for MBB for consistency with the getter. | Quentin Colombet | 2016-03-11 | 2 | -4/+6
  llvm-svn: 263261
* [IRTranslator] Update getOrCreateBB API to use references. | Quentin Colombet | 2016-03-11 | 1 | -4/+4
  A null basic block is invalid, so just pass a reference.
  llvm-svn: 263260
* [misched] Fix a truncation issue from r263021. | Chad Rosier | 2016-03-11 | 1 | -1/+1
  The truncation was causing the sorting algorithm to behave oddly when comparing positive and negative offsets. Fortunately, this doesn't currently happen in practice and was exposed by a WIP. Thus, I can't test this change now, but the follow-on patch will.
  llvm-svn: 263255
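Why truncation upsets a comparison between positive and negative offsets can be shown with two comparators. This is only a sketch: the message does not state the field widths involved, so the 32-bit unsigned narrowing and the names `lessTruncated` and `lessSigned` are assumptions made for illustration.

```cpp
#include <cstdint>

// Compares offsets after narrowing them to 32-bit unsigned, as a truncating
// field would (the width is an assumption). A negative offset wraps to a
// large positive value and therefore compares greater than small positive
// offsets.
constexpr bool lessTruncated(std::int64_t A, std::int64_t B) {
  return static_cast<std::uint32_t>(A) < static_cast<std::uint32_t>(B);
}

// Compares the offsets at full signed width, which preserves their order.
constexpr bool lessSigned(std::int64_t A, std::int64_t B) { return A < B; }
```

With these definitions, -8 orders before 8 under `lessSigned` but after it under `lessTruncated`, so a sort driven by the truncated values would place negative offsets last.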
* Minor code cleanups. NFC. | Junmo Park | 2016-03-11 | 1 | -3/+3
  llvm-svn: 263200
* Minor code cleanup. NFC. | Junmo Park | 2016-03-11 | 1 | -1/+1
  llvm-svn: 263196
* Remove llvm::getDISubprogram in favor of Function::getSubprogram | Pete Cooper | 2016-03-11 | 1 | -2/+2
  llvm::getDISubprogram walks the instructions in a function, looking for one in the scope of the current function, so that it can find the !dbg entry for the subprogram itself. Now that !dbg is attached to functions, this should not be necessary. This patch changes all uses to just query the subprogram directly on the function.
  Ideally this should be NFC, but in reality it's possible that a function has no !dbg (in which case there's likely a bug somewhere in an opt pass), or that none of the instructions had a scope referencing the function, so we used to not find the !dbg on the function but now we will.
  Reviewed by Duncan Exon Smith.
  Differential Revision: http://reviews.llvm.org/D18074
  llvm-svn: 263184
* Test commit access | Marianne Mailhot-Sarrasin | 2016-03-10 | 1 | -1/+1
  llvm-svn: 263165
* [X86][SSE] Reapplied: Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG | Simon Pilgrim | 2016-03-10 | 2 | -2/+19
  Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns.
  Reapplied with a fix for PR26870 (avoid premature use of TargetConstant in ZERO_EXTEND_VECTOR_INREG expansion).
  Differential Revision: http://reviews.llvm.org/D17691
  llvm-svn: 263159
* [PM] Port memdep to the new pass manager. | Chandler Carruth | 2016-03-10 | 1 | -1/+1
  This is a fairly straightforward port to the new pass manager with one exception. It removes a very questionable use of releaseMemory() in the old pass to invalidate its caches between runs on a function. I don't think this is really guaranteed to be safe. I've just used the more direct port to the new PM to address this by nuking the results object each time the pass runs.
  While this could cause some minor malloc traffic increase, I don't expect the compile time performance hit to be noticeable, and it makes the correctness and other aspects of the pass much easier to reason about. In some cases, it may make things faster by making the sets and maps smaller with better locality. Indeed, the measurements collected by Bruno (thanks!!!) show mostly compile time improvements.
  There is sadly very limited testing at this point as there are only two tests of memdep, and both rely on GVN. I'll be porting GVN next and that will exercise this heavily though.
  Differential Revision: http://reviews.llvm.org/D17962
  llvm-svn: 263082
* [CGP] Duplicate addressing computation in cold paths if required to sink addressing mode | Philip Reames | 2016-03-09 | 1 | -8/+45
  This patch teaches CGP to duplicate addressing mode computations into cold paths (detected via explicit cold attribute on calls) if required to let addressing mode be safely sunk into the basic block containing each load and store.
  In general, duplicating code into cold blocks may result in code growth, but should not affect performance. In this case, it's better to duplicate some code than to put extra pressure on the register allocator by making it keep the address through the entirety of the fast path.
  This patch only handles addressing computations, but in principle, we could implement a more general cold scheduling heuristic which tries to reduce register pressure in the fast path by duplicating code into the cold path. Getting the profitability of the general case right seemed likely to be challenging, so I stuck to the existing case (addressing computation) we already had.
  Differential Revision: http://reviews.llvm.org/D17652
  llvm-svn: 263074
* SelectionDAG: Fix a crash on inline asm when output register supports multiple types | Tom Stellard | 2016-03-09 | 1 | -3/+7
  Summary: The code in SelectionDAG did not handle the case where the register type and output types were different, but had the same size.
  Reviewers: arsenm, echristo
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D17940
  llvm-svn: 263022
* [TII] Allow getMemOpBaseRegImmOfs() to accept negative offsets. NFC. | Chad Rosier | 2016-03-09 | 3 | -8/+9
  http://reviews.llvm.org/D17967
  llvm-svn: 263021
* Invoke DAG postprocessing in the post-RA scheduler | Krzysztof Parzyszek | 2016-03-08 | 1 | -0/+2
  This was inadvertently omitted from r262774, which added the mutation interface.
  llvm-svn: 262939
* Revert r262599 "[X86][SSE] Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG" | Hans Wennborg | 2016-03-08 | 1 | -18/+1
  This caused PR26870.
  llvm-svn: 262935
* Add DAG mutation interface to the DFA packetizer | Krzysztof Parzyszek | 2016-03-08 | 1 | -0/+24
  llvm-svn: 262930
* Re-apply "SelectionDAG: Store SDNode operands in an ArrayRecycler" | Justin Bogner | 2016-03-08 | 1 | -143/+118
  This re-applies r262886 with a fix for 32 bit platforms that have 8 byte pointer alignment, effectively reverting r262892.
  Original Message:
  Currently some SDNode operands are malloc'd, some are stored inline in subclasses of SDNode, and some are thrown into a BumpPtrAllocator. This scheme is complex, inconsistent, and makes refactoring SDNodes fairly difficult.
  Instead, we can allocate all of the operands using an ArrayRecycler that wraps a BumpPtrAllocator. This keeps the cache locality when iterating operands, improves locality when iterating SDNodes without looking at operands, and vastly simplifies the ownership semantics.
  It also means we stop overallocating SDNodes by 2-3x and will make it simpler to fix the rampant undefined behaviour we have in how we mutate SDNodes from one kind to another (See llvm.org/pr26808).
  This is NFC other than the changes in memory behaviour, and I ran some LNT tests to make sure this didn't hurt compile time. Not many tests changed: there were a couple of 1-2% regressions reported, but there were more improvements (of up to 4%) than regressions.
  llvm-svn: 262902
* [MIR] Change the token name for '<' and '>' to be consistent with the LLVM IR parser. | Quentin Colombet | 2016-03-08 | 2 | -4/+4
  Thanks to Ahmed Bougacha for noticing!
  llvm-svn: 262899
* [GlobalISel] Introduce initializer method to support start/stop-after features. | Quentin Colombet | 2016-03-08 | 4 | -25/+33
  llvm-svn: 262896