path: root/llvm/lib/CodeGen
Commit message  Author  Age  Files  Lines
* DwarfAccelTable: fix obvious typo.Frederic Riss2015-03-091-1/+1
| | | | | | | | | | | | I have a test for that issue, but I didn't include it in the commit as it's a 200KB file for a pretty minor issue. (The reason the file is so big is that it needs > 1024 variables/functions to trigger, and that with debug information.) The issue/fix on the other side is totally trivial. If people want the test committed, I can do that. It just didn't seem worth it to me. llvm-svn: 231701
* Don't prime the section map.Rafael Espindola2015-03-091-3/+0
| | | | | | | This was just creating unused labels for .text when the module had no functions. llvm-svn: 231694
* Print jump tables before exception tables.Rafael Espindola2015-03-095-32/+49
| | | | | | | | | | | In the case where just tables are part of the function section, this produces more readable assembly by avoiding switching to the eh section and back to .text. This would also break with non unique section names, as trying to switch to a unique section actually creates a new one. llvm-svn: 231677
* Don't repeat name in comment. NFC.Rafael Espindola2015-03-091-18/+12
| | | | llvm-svn: 231676
* Remove dummy method implementations.Rafael Espindola2015-03-092-26/+0
| | | | | | | These are pure virtual in the base class, so the compiler checks that they are implemented. llvm-svn: 231673
* Simplify expressions involving boolean constants with clang-tidyDavid Blaikie2015-03-093-4/+3
| | | | | | | | Patch by Richard (legalize at xmission dot com). Differential Revision: http://reviews.llvm.org/D8154 llvm-svn: 231617
* Make static variables const if possible. Makes them go into a read-only section.Benjamin Kramer2015-03-082-3/+4
| | | | | | Or fold them into an initializer list, which has the same effect. NFC. llvm-svn: 231598
* [DAGCombiner] Add a shuffle mask commutation helper function. NFCI.Simon Pilgrim2015-03-072-39/+5
| | | | | | | | | | We have an increasing number of cases where we are creating commuted shuffle masks - all implementing nearly the same code. This patch adds a static helper function - ShuffleVectorSDNode::commuteMask() and replaces a number of cases to use it. Differential Revision: http://reviews.llvm.org/D8139 llvm-svn: 231581
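The idea behind a mask commutation helper can be sketched in standalone C++. This is a minimal illustration, not the code from the patch: `commuteShuffleMask` is a hypothetical name, and the sketch assumes the usual two-input shuffle convention in which indices in [0, N) select from the first operand, indices in [N, 2N) select from the second, and negative indices mean undef.

```cpp
#include <cassert>
#include <vector>

// Hypothetical standalone helper (not the LLVM API): rewrite a two-input
// shuffle mask so that it describes the same shuffle after the two input
// operands have been swapped.
void commuteShuffleMask(std::vector<int> &Mask) {
  const int NumElts = static_cast<int>(Mask.size());
  for (int &Idx : Mask) {
    if (Idx < 0)
      continue; // undef lane stays undef
    assert(Idx < 2 * NumElts && "mask index out of range");
    // Lanes that read operand 0 now read operand 1, and vice versa.
    Idx = (Idx < NumElts) ? Idx + NumElts : Idx - NumElts;
  }
}
```

With four elements per vector, the mask {0, 5, -1, 3} becomes {4, 1, -1, 7} once the operands are swapped.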
* Make constant arrays that are passed to functions as const.Benjamin Kramer2015-03-071-1/+1
| | | | | | | | In theory this allows the compiler to skip materializing the array on the stack. In practice clang often fails to do that, but that's a different story. NFC. llvm-svn: 231571
* Use SDValue bool check to tidyup some possible combines. NFC.Simon Pilgrim2015-03-071-6/+5
| | | | llvm-svn: 231569
* [DAGCombiner] Fix wrong folding of AND dag nodes.Andrea Di Biagio2015-03-071-3/+7
| | | | | | | | | | | | | | | | | | | | | | | This patch fixes the logic in the DAGCombiner that folds an AND node according to rule: (and (X (load V)), C) -> (X (load V)) An AND between a vector load 'X' and a constant build_vector 'C' can be folded into the load itself only if we can prove that the AND operation is redundant. The algorithm implemented by 'visitAND' firstly computes the splat value 'S' from C, and then checks if S has the lower 'B' bits set (where B is the size in bits of the vector element type). The algorithm takes into account also the 'undef' bits in the splat mask. Unfortunately, the algorithm only worked under the assumption that the size of S is a multiple of the vector element type. With this patch, we conservatively avoid folding the AND if the splat bits are not compatible with the vector element type. Added X86 test and-load-fold.ll Differential Revision: http://reviews.llvm.org/D8085 llvm-svn: 231563
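The compatibility check described above can be illustrated with plain 64-bit integers. This is a simplified sketch under stated assumptions, not the DAGCombiner code: `splatCoversElement` and `lowMask` are hypothetical names, undef bits are ignored, and widths are limited to 64 bits. It reports whether every element-sized chunk of the splat has all of its bits set, and it conservatively answers "no" when the splat size is not a multiple of the element size.

```cpp
#include <cstdint>

// Mask with the low `Bits` bits set; Bits must be in [1, 64].
static uint64_t lowMask(unsigned Bits) {
  return Bits >= 64 ? ~0ULL : ((1ULL << Bits) - 1);
}

// Hypothetical helper: returns true only if a splat of SplatBits bits is
// guaranteed to have all EltBits bits set in every vector element.
bool splatCoversElement(uint64_t Splat, unsigned SplatBits, unsigned EltBits) {
  if (SplatBits == 0 || EltBits == 0 || SplatBits > 64 || EltBits > 64)
    return false;
  Splat &= lowMask(SplatBits);

  // A splat narrower than one element repeats inside it, so replicate it
  // until it is at least one element wide.
  while (SplatBits < EltBits) {
    Splat |= Splat << SplatBits;
    SplatBits *= 2;
  }

  // Conservative bail-out: the splat pattern does not line up with the
  // element boundaries, so no single per-element constant exists.
  if (SplatBits % EltBits != 0)
    return false;

  // AND together every element-sized chunk of the splat.
  uint64_t Common = lowMask(EltBits);
  for (unsigned Off = 0; Off < SplatBits; Off += EltBits)
    Common &= Splat >> Off;
  return Common == lowMask(EltBits);
}
```

For example, a 24-bit splat of all ones against 16-bit elements is rejected because 24 is not a multiple of 16, even though every bit is set.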
* [DAGCombiner] SCALAR_TO_VECTOR(EXTRACT_VECTOR_ELT(V,C)) -> VECTOR_SHUFFLESimon Pilgrim2015-03-071-0/+30
| | | | | | | | | | | | This patch attempts to convert a SCALAR_TO_VECTOR using an operand from an EXTRACT_VECTOR_ELT into a VECTOR_SHUFFLE. This prevents many cases of spilling scalar data between the gpr + simd registers. At present the optimization only accepts cases where there is no TRUNC of the scalar type (i.e. all types must match). Differential Revision: http://reviews.llvm.org/D8132 llvm-svn: 231554
* DAGCombiner: Canonicalize select(and/or,x,y) depending on target.Matthias Braun2015-03-061-0/+63
| This is based on the following equivalences:
|
|   select(C0 & C1, X, Y) <=> select(C0, select(C1, X, Y), Y)
|   select(C0 | C1, X, Y) <=> select(C0, X, select(C1, X, Y))
|
| Many targets cannot perform and/or on the CPU flags, and therefore the right side should be chosen to avoid materializing the i1 flags in an integer register. If the target can perform this operation efficiently, we normalize to the left form.
|
| Differential Revision: http://reviews.llvm.org/D7622
|
| llvm-svn: 231507
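The two select equivalences can be checked exhaustively over the condition values. The sketch below models `select` with the C++ conditional operator; all function names are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Left-hand forms: a single select on a combined condition.
int selectAnd(bool C0, bool C1, int X, int Y) { return (C0 && C1) ? X : Y; }
int selectOr(bool C0, bool C1, int X, int Y) { return (C0 || C1) ? X : Y; }

// Right-hand forms: nested selects, each testing one condition.
int selectAndNested(bool C0, bool C1, int X, int Y) {
  return C0 ? (C1 ? X : Y) : Y;
}
int selectOrNested(bool C0, bool C1, int X, int Y) {
  return C0 ? X : (C1 ? X : Y);
}

// Returns true if both equivalences hold for every combination of C0, C1.
bool equivalencesHold(int X, int Y) {
  for (int C0 = 0; C0 <= 1; ++C0)
    for (int C1 = 0; C1 <= 1; ++C1) {
      if (selectAnd(C0, C1, X, Y) != selectAndNested(C0, C1, X, Y))
        return false;
      if (selectOr(C0, C1, X, Y) != selectOrNested(C0, C1, X, Y))
        return false;
    }
  return true;
}
```

The nested forms never need the combined boolean as a value, which is the property the commit relies on for targets that cannot and/or CPU flags directly.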
* DAGCombiner: Factor out some and/or combines.Matthias Braun2015-03-061-225/+252
| This is in preparation for changing visitSELECT to normalize towards
|
|   select(Cond0, select(Cond1, X, Y), Y);
|   select(Cond0, X, select(Cond1, X, Y))
|
| which perform an implicit and/or of the conditions.
|
| The factored function contains all DAGCombine rules which reduce two values combined by an And/Or operation to a single value. This does not include rules involving constants as visitSELECT already handles that case.
|
| Differential Revision: http://reviews.llvm.org/D8026
|
| llvm-svn: 231506
* ExecutionDepsFix: Indizes -> Indices.Matthias Braun2015-03-061-10/+10
| | | | | | Translate German to English. llvm-svn: 231500
* Fix typo.Eric Christopher2015-03-061-1/+1
| | | | llvm-svn: 231495
* [AsmPrinter][TLOF] 32-bit MachO support for replacing GOT equivalentsBruno Cardoso Lopes2015-03-062-4/+71
| Add MachO 32-bit (i.e. arm and x86) support for replacing global GOT equivalent symbol accesses. Unlike 64-bit targets, there's no GOTPCREL relocation, and access through a non_lazy_symbol_pointers section is used instead.
|
| -- before
|
|   _extgotequiv:
|      .long _extfoo
|
|   _delta:
|      .long _extgotequiv-_delta
|
| -- after
|
|   _delta:
|      .long L_extfoo$non_lazy_ptr-_delta
|
|      .section __IMPORT,__pointers,non_lazy_symbol_pointers
|   L_extfoo$non_lazy_ptr:
|      .indirect_symbol _extfoo
|      .long 0
|
| llvm-svn: 231475
* [AsmPrinter][TLOF] ARM64 MachO support for replacing GOT equivalentsBruno Cardoso Lopes2015-03-061-1/+6
| Follow up r230264 and add ARM64 support for replacing global GOT equivalent symbol accesses by references to the GOT entry for the final symbol instead, example:
|
| -- before
|
|   .globl _foo
|   _foo:
|      .long 42
|
|   .globl _gotequivalent
|   _gotequivalent:
|      .quad _foo
|
|   .globl _delta
|   _delta:
|      .long _gotequivalent-_delta
|
| -- after
|
|   .globl _foo
|   _foo:
|      .long 42
|
|   .globl _delta
|   Ltmp3:
|      .long _foo@GOT-Ltmp3
|
| llvm-svn: 231474
* LegalizeTypes: Handle shift by 0 in ExpandShiftByConstant.Michael Zolotukhin2015-03-061-1/+8
| | | | | | | Though such shifts are usually optimized away by combiner, we still can encounter them after a vector shift is legalized. llvm-svn: 231443
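Why a shift amount of 0 needs its own case when a wide shift is expanded into narrower pieces can be seen in a small standalone sketch. This is an illustration under assumptions, not the LegalizeTypes code: `shl64ViaHalves` is a hypothetical function that performs a 64-bit left shift by a constant using only 32-bit operations. The general path computes `Lo >> (32 - Amt)`, which would be a shift by the full register width when `Amt` is 0.

```cpp
#include <cstdint>

// Hypothetical illustration: 64-bit left shift by a constant amount, built
// from two 32-bit halves.
uint64_t shl64ViaHalves(uint64_t Value, unsigned Amt) {
  const uint32_t Lo = static_cast<uint32_t>(Value);
  const uint32_t Hi = static_cast<uint32_t>(Value >> 32);
  uint32_t OutLo, OutHi;

  if (Amt == 0) {
    // Explicit case: the general path below would shift by 32 - 0 = 32,
    // which is out of range for a 32-bit value.
    OutLo = Lo;
    OutHi = Hi;
  } else if (Amt >= 64) {
    // Everything is shifted out (a choice made for this sketch).
    OutLo = 0;
    OutHi = 0;
  } else if (Amt >= 32) {
    // The low half moves entirely into the high half.
    OutLo = 0;
    OutHi = Lo << (Amt - 32);
  } else {
    // General case, valid only for 0 < Amt < 32.
    OutLo = Lo << Amt;
    OutHi = (Hi << Amt) | (Lo >> (32 - Amt));
  }
  return (static_cast<uint64_t>(OutHi) << 32) | OutLo;
}
```

For any amount below 64, the result matches a native 64-bit shift of the same value.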
* SelectionDAGBuilder: Merge 3 copies of the limited precision exp2 emission code.Benjamin Kramer2015-03-051-251/+93
| | | | | | NFC intended. llvm-svn: 231406
* Fix uninitialized memory references in WinEHPrepareAndrew Kaylor2015-03-051-1/+3
| | | | llvm-svn: 231405
* SDAG: Merge the meat of two ExpandAtomic implementations.Benjamin Kramer2015-03-053-212/+42
| | | | | | | The copies already diverged, don't let them become any worse. Reduce redundancy in code with a little macro metaprogramming. llvm-svn: 231401
* Use the correct func begin symbol in all places in ppc.Rafael Espindola2015-03-051-7/+9
| | | | | | I missed an occurrence of the old symbol in my previous patch. llvm-svn: 231398
* Use the generic Lfunc_begin label on ppc.Rafael Espindola2015-03-051-1/+5
| | | | | | This removes yet another custom label to mark the start of a function. llvm-svn: 231390
* X86: Optimize address mode matching for FRAME_ALLOC_RECOVER nodesDavid Majnemer2015-03-051-0/+1
| | | | | | | We know that the absolute symbol will be less than 2GB and thus will always fit. llvm-svn: 231389
* Replace llvm.frameallocate with llvm.frameescapeReid Kleckner2015-03-053-192/+78
| | | | | | | | | | Turns out it's pretty straightforward and simplifies the implementation. Reviewers: andrew.w.kaylor Differential Revision: http://reviews.llvm.org/D8051 llvm-svn: 231386
* [DagCombiner] Allow shuffles to merge through bitcastsSimon Pilgrim2015-03-051-0/+83
| | | | | | | | | | | | Currently shuffles may only be combined if they are of the same type, despite the fact that bitcasts are often introduced in between shuffle nodes (e.g. x86 shuffle type widening). This patch allows a single input shuffle to peek through bitcasts and if the input is another shuffle will merge them, shuffling using the smallest sized type, and re-applying the bitcasts at the inputs and output instead. Dropped old ShuffleToZext test - this patch removes the use of the zext and vector-zext.ll covers these anyhow. Differential Revision: http://reviews.llvm.org/D7939 llvm-svn: 231380
* Revert change r231366 as it broke clang-native-arm-cortex-a9 Analysis/properties.m test.Igor Laevsky2015-03-053-92/+22
| | | | llvm-svn: 231374
* AVX-512, SKX: Enabled masked_load/store operations for this target.Elena Demikhovsky2015-03-051-2/+2
| | | | | | | Added lowering for ISD::CONCAT_VECTORS and ISD::INSERT_SUBVECTOR for i1 vectors, it is needed to pass all masked_memop.ll tests for SKX. llvm-svn: 231371
* Teach lowering to correctly handle invoke statepoint and gc results tied to them.Igor Laevsky2015-03-053-22/+92
| | | | | | Note that we still cannot lower gc.relocates for invoke statepoints. Also, this extracts the getCopyFromRegs helper function in SelectionDAGBuilder, as we need to be able to customize the type of the register exported from a basic block during lowering of the gc.result. llvm-svn: 231366
* [PBQP] Use a local bit-matrix to speedup searching an edge in the graph.Arnaud A. de Grandmaison2015-03-051-3/+10
| Build time (user time) for building llvm+clang+lldb in release mode:
| - default allocator: 9086 seconds
| - with PBQP: 9126 seconds
| - with PBQP + local bit matrix cache: 9097 seconds
|
| llvm-svn: 231360
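A bit matrix used as an edge-existence cache can be sketched as follows. This is a generic illustration, not the data structure from the patch: `EdgeBitMatrix` is a hypothetical class, and it assumes nodes are identified by dense indices in [0, N). The point is that "is there already an edge between U and V?" becomes a single indexed lookup instead of a walk over an adjacency list.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: dense N x N bit matrix recording which node pairs are
// already connected by an edge in an undirected graph.
class EdgeBitMatrix {
public:
  explicit EdgeBitMatrix(std::size_t NumNodes)
      : N(NumNodes), Bits(NumNodes * NumNodes, false) {}

  // Record an undirected edge between U and V.
  void addEdge(std::size_t U, std::size_t V) {
    assert(U < N && V < N && "node id out of range");
    Bits[U * N + V] = true;
    Bits[V * N + U] = true;
  }

  // Constant-time query: has an edge between U and V been recorded?
  bool hasEdge(std::size_t U, std::size_t V) const {
    assert(U < N && V < N && "node id out of range");
    return Bits[U * N + V];
  }

private:
  std::size_t N;
  std::vector<bool> Bits; // row-major, one bit per (U, V) pair
};
```

The trade-off is quadratic memory in the number of nodes, which is why a cache like this is usually kept local to the phase that needs it.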
* Remove useless break after return.Frederic Riss2015-03-051-1/+0
| | | | | | Pointed out by Paul Robinson. llvm-svn: 231353
* [MBP] Use range based for-loops throughout this code. Several hadChandler Carruth2015-03-051-140/+108
| | | | | | | already been added and the inconsistency made choosing names and changing code more annoying. Plus, wow are they better for this code! llvm-svn: 231347
* [MBP] NFC, run clang-format over this code and tweak things to make theChandler Carruth2015-03-051-71/+62
| | | | | | | | | | | result reasonable. This code predated clang-format and so there was a reasonable amount of crufty formatting that had accumulated. This should ensure that neither myself nor others end up with formatting-only changes sneaking into other fixes. llvm-svn: 231341
* [MBP] This is no longer 'block-placement2'. ;] The old variants are longChandler Carruth2015-03-051-3/+3
| | | | | | gone, update this code to reflect that. llvm-svn: 231340
* Use the existing begin and end symbol for debug info.Rafael Espindola2015-03-055-27/+11
| | | | llvm-svn: 231338
* [MBP] Revert r231238 which attempted to fix a nasty bug where MBP isChandler Carruth2015-03-051-26/+0
| | | | | | | | | | | | | just arbitrarily interleaving unrelated control flows once they get moved "out-of-line" (both outside of natural CFG ordering and with diamonds that cannot be fully laid out by chaining fallthrough edges). This easy solution doesn't work in practice, and it isn't just a small bug. It looks like a very different strategy will be required. I'm working on that now, and it'll again go behind some flag so that everyone can experiment and make sure it is working well for them. llvm-svn: 231332
* Turn off .debug_pubnames/pubtypes for PS4.Paul Robinson2015-03-051-2/+2
| | | | | | Differential Revision: http://reviews.llvm.org/D8067 llvm-svn: 231322
* Teach DIEInteger to emit FORM_strp and FORM_ref_addr attributes.Frederic Riss2015-03-041-0/+10
| | | | | | | | To be used/tested by llvm-dsymutil. (llvm-dsymutil does a 'static' link, no need for relocations for most things, so it'll just emit raw integers for most attributes) llvm-svn: 231298
* Support standard DWARF TLS opcode; Darwin and PS4 use it.Paul Robinson2015-03-043-2/+17
| | | | | | Differential Revision: http://reviews.llvm.org/D8018 llvm-svn: 231286
* Make DataLayout Non-Optional in the ModuleMehdi Amini2015-03-041-2/+2
| Summary: DataLayout keeps the string used for its creation. As a side effect it is no longer needed in the Module. This is "almost" NFC: the string is no longer canonicalized, so you can't rely on two "equal" DataLayouts having the same string returned by getStringRepresentation().
|
| Get rid of DataLayoutPass: the DataLayout is in the Module
|
| The DataLayout is "per-module", let's enforce this by not duplicating it more than necessary. One more step toward non-optionality of the DataLayout in the module.
|
| Make DataLayout Non-Optional in the Module
|
| Module->getDataLayout() will never return nullptr anymore.
|
| Reviewers: echristo
| Subscribers: resistor, llvm-commits, jholewinski
| Differential Revision: http://reviews.llvm.org/D7992
| From: Mehdi Amini <mehdi.amini@apple.com>
|
| llvm-svn: 231270
* Revert the test commit.Wei Mi2015-03-041-1/+0
| | | | llvm-svn: 231264
* Test commit. It will be reverted in the next commit.Wei Mi2015-03-041-0/+1
| | | | llvm-svn: 231262
* Fix DwarfExpression::AddMachineRegExpression so it doesn't read past theAdrian Prantl2015-03-041-11/+15
| | | | | | | end of an expression that ends with DW_OP_plus. Caught by the ASAN build bots. llvm-svn: 231260
* Mutate TargetLowering::shouldExpandAtomicRMWInIR to specifically dictate how AtomicRMWInsts are expanded.JF Bastien2015-03-041-9/+24
| Summary: In PNaCl, most atomic instructions have their own @llvm.nacl.atomic.* function, each one, with a few exceptions, represents a consistent behaviour across all NaCl-supported targets. Unfortunately, the atomic RMW operations nand, [u]min, and [u]max aren't directly represented by any such @llvm.nacl.atomic.* function.
|
| This patch refines shouldExpandAtomicRMWInIR in TargetLowering so that a future `Le32TargetLowering` class can selectively inform the caller how the target desires the atomic RMW instruction to be expanded (i.e. via load-linked/store-conditional for ARM/AArch64, via cmpxchg for X86/others?, or not at all for Mips), if at all.
|
| This does not represent a behavioural change and as such no tests were added.
|
| Patch by: Richard Diamond.
|
| Reviewers: jfb
| Reviewed By: jfb
| Subscribers: jfb, aemerson, t.p.northover, llvm-commits
| Differential Revision: http://reviews.llvm.org/D7713
|
| llvm-svn: 231250
* [MBP] Fix a really horrible bug in MachineBlockPlacement, but behindChandler Carruth2015-03-041-0/+26
| a flag for now.
|
| First off, thanks to Daniel Jasper for really pointing out the issue here. It's been here forever (at least, I think it was there when I first wrote this code) without getting really noticed or fixed.
|
| The key problem is what happens when two reasonably common patterns happen at the same time: we outline multiple cold regions of code, and those regions in turn have diamonds or other CFGs for which we can't just topologically lay them out. Consider some C code that looks like:
|
|   if (a1()) {
|     if (b1())
|       c1();
|     else
|       d1();
|     f1();
|   }
|   if (a2()) {
|     if (b2())
|       c2();
|     else
|       d2();
|     f2();
|   }
|   done();
|
| Now consider the case where a1() and a2() are unlikely to be true. In that case, we might lay out the first part of the function like:
|
|   a1, a2, done;
|
| And then we will be out of successors in which to build the chain. We go to find the best block to continue the chain with, which is perfectly reasonable here, and find "b1" let's say. Laying out successors gets us to:
|
|   a1, a2, done; b1, c1;
|
| At this point, we will refuse to lay out the successor to c1 (f1) because there are still un-placed predecessors of f1 and we want to try to preserve the CFG structure. So we go get the next best block, d1.
|
| ... wait for it ...
|
| Except that the next best block *isn't* d1. It is b2! d1 is waaay down inside these conditionals. It is much less important than b2. Except that this is exactly what we didn't want. If we keep going we get the entire set of the rest of the CFG *interleaved*!!!
|
|   a1, a2, done; b1, c1; b2, c2; d1, f1; d2, f2;
|
| So we clearly need a better strategy here. =] My current favorite strategy is to actually try to place the block whose predecessor is closest. This very simply ensures that we unwind these kinds of CFGs the way that is natural and fitting, and should minimize the number of cache lines instructions are spread across. It also happens to be *dead simple*. It's like the datastructure was specifically set up for this use case or something. We only push blocks onto the work list when the last predecessor for them is placed into the chain. So the back of the worklist *is* the nearest next block.
|
| Unfortunately, a change like this is going to cause *soooo* many benchmarks to swing wildly. So for now I'm adding this under a flag so that we and others can validate that this is fixing the problems described, that it seems possible to enable, and hopefully that it fixes more of our problems long term.
|
| llvm-svn: 231238
* Add a flag to experiment with outlining optional branches.Daniel Jasper2015-03-041-2/+46
| | | | | | | | | | | | | In a CFG with the edges A->B->C and A->C, B is an optional branch. LLVM's default behavior is to lay the blocks out naturally, i.e. A, B, C, in order to improve code locality and fallthroughs. However, if a function contains many of those optional branches only a few of which are taken, this leads to a lot of unnecessary icache misses. Moving B out of line can work around this. Review: http://reviews.llvm.org/D7719 llvm-svn: 231230
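The triangle shape that the message calls an optional branch (edges A->B->C together with A->C) can be recognized with a few lookups on a successor map. The sketch below only illustrates that shape; `CFG` and `isOptionalBranch` are hypothetical names, and the criterion actually used by the flag in MachineBlockPlacement may differ.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical CFG representation: block name -> set of successor names.
using CFG = std::map<std::string, std::set<std::string>>;

// Returns true if the edges A->B, B->C and A->C all exist, i.e. B can be
// skipped on the way from A to C.
bool isOptionalBranch(const CFG &G, const std::string &A,
                      const std::string &B, const std::string &C) {
  auto HasEdge = [&G](const std::string &From, const std::string &To) {
    auto It = G.find(From);
    return It != G.end() && It->second.count(To) != 0;
  };
  return HasEdge(A, B) && HasEdge(B, C) && HasEdge(A, C);
}
```

In the natural layout A, B, C, the code of B sits between A and C even when it is rarely executed; moving B out of line lets the common A->C path stay contiguous.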
* [DAGCombine] Fix a bug in a BUILD_VECTOR combineMichael Kuperstein2015-03-041-2/+3
| | | | | | | | | | When trying to convert a BUILD_VECTOR into a shuffle, we try to split a single source vector that is twice as wide as the destination vector. We can not do this when we also need the zero vector to create a blend. This fixes PR22774. Differential Revision: http://reviews.llvm.org/D8040 llvm-svn: 231219
* Move emitDIE and emitAbbrevs to AsmPrinter. NFC.Frederic Riss2015-03-045-68/+66
| | | | | | | | | | | | | | (They are called emitDwarfDIE and emitDwarfAbbrevs in their new home) llvm-dsymutil wants to reuse that code, but it doesn't have a DwarfUnit or a DwarfDebug object to call those. It has access to an AsmPrinter though. Having emitDIE in the AsmPrinter also removes the DwarfFile dependency on DwarfDebug, and thus the patch drops that field. Differential Revision: http://reviews.llvm.org/D8024 llvm-svn: 231210
* Constify AsmPrinter passed to DIE methods.Frederic Riss2015-03-041-22/+22
| | | | llvm-svn: 231209