path: root/llvm/lib
Commit message  Author  Age  Files  Lines
* [x86] Restructure the comments and the conditions for handlingChandler Carruth2015-02-261-13/+19
| | | | | | | | | | | | | | dynamic blends. This makes it much more clear what is going on. The case we're handling is that of dynamic conditions, and we're bailing when the nature of the vector types and subtarget preclude lowering the dynamic condition vselect as an actual blend. No functionality changed here, but this will make a subsequent bug-fix to this code much more clear. llvm-svn: 230690
* [x86] Re-order the combines of select in the X86 backend. This doesn'tChandler Carruth2015-02-261-19/+19
| | | | | | | change functionality, but makes it more clear that the dynamic case and the shuffle case don't overlap in any interesting way. llvm-svn: 230689
* [x86] Add an assert to catch if we ever try to blend a v32i8 withoutChandler Carruth2015-02-261-0/+3
| | | | | | AVX2. llvm-svn: 230688
* Silence some Win64 clang-cl warnings about unused stuff due to ifdefsReid Kleckner2015-02-261-1/+2
| | | | llvm-svn: 230685
* Use wider type for overflow check on LLP64 platforms like Win64, found by clang-cl -Wtautological Reid Kleckner 2015-02-26 1 -2/+2
| | | | | | llvm-svn: 230684
* InstrProf: Simplify the construction of BinaryCoverageReaderJustin Bogner2015-02-262-64/+61
| | | | | | | | | | | | | Creating BinaryCoverageReader is a strange and complicated dance where the constructor sets error codes that member functions will later read, and the object is in an invalid state if readHeader isn't immediately called after construction. Instead, make the constructor private and add a static create method to do the construction properly. This also has the benefit of removing readHeader completely and simplifying the interface of the object. llvm-svn: 230676
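The private-constructor-plus-static-factory shape described in this commit can be sketched roughly as follows. This is a minimal illustration only: `SketchReader`, the `COVM` magic string, and the error-string out-parameter are hypothetical names invented for the sketch and are not the actual BinaryCoverageReader interface.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a reader class; not the LLVM class.
class SketchReader {
  std::string Data;

  // Private constructor: callers can never hold a half-initialized object.
  explicit SketchReader(std::string D) : Data(std::move(D)) {}

public:
  // All header validation happens here, before any object escapes.
  // Returns nullptr and fills Err when the (made-up) header is invalid.
  static std::unique_ptr<SketchReader> create(const std::string &Buffer,
                                              std::string &Err) {
    if (Buffer.size() < 4 || Buffer.compare(0, 4, "COVM") != 0) {
      Err = "bad header";
      return nullptr;
    }
    return std::unique_ptr<SketchReader>(new SketchReader(Buffer.substr(4)));
  }

  const std::string &payload() const { return Data; }
};
```

With this shape there is no separate header-reading step for callers to forget: either `create` hands back a fully valid object or it reports an error.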
* InstrProf: Rename ObjectFileCoverageMappingReader to BinaryCoverageReaderJustin Bogner2015-02-262-6/+6
| | | | | | | The current name is long and confusing. A shorter one is both easier to understand and easier to work with. llvm-svn: 230675
* SCEVExpander incorrectly marks generated subtractions as nuw/nswSanjoy Das2015-02-261-3/+6
| | | | | | | | | | | | | | | It is not sound to mark the increment operation as `nuw` or `nsw` based on a proof derived from the add recurrence if the increment operation we emit happens to be a `sub` instruction. I could not come up with a test case for this -- the set of cases where SCEVExpander decides to emit a `sub` instruction is quite small, and I cannot think of a way I'd be able to get SCEV to prove that the increment does not overflow in those cases. Differential Revision: http://reviews.llvm.org/D7899 llvm-svn: 230673
* [MC] Use the non-EH register mapping in the debug_frame section.Frederic Riss2015-02-261-4/+20
| | | | | | | | | | | | | | | | | | | | On 32-bit x86 Darwin, the register mappings for the eh_frame and debug_frame sections are different. Thus the same CFI instructions should result in different registers in the object file. The problem isn't target specific though, but it requires that the mappings for EH register numbers be different from the standard Dwarf one. The patch looks a bit clumsy. LLVM uses the EH mapping as canonical for everything frame related. Thus we need to do a double conversion EH -> LLVM -> Non-EH, when emitting the debug_frame section. Fixes PR22363. Differential Revision: http://reviews.llvm.org/D7593 llvm-svn: 230670
* Don't sibcall between SysV and Win64 convention functionsReid Kleckner2015-02-261-0/+6
| | | | | | | | The shadow stack space expectations won't match. Fixes PR22709. llvm-svn: 230667
* [InstCombine/PowerPC] Convert aligned QPX load/store intrinsics into loads/stores Hal Finkel 2015-02-26 1 -0/+38
| | | | | | | | | | InstCombine has long had logic to convert aligned Altivec load/store intrinsics into regular loads and stores. This mirrors that functionality for QPX vector load/store intrinsics. llvm-svn: 230660
* When the source has a series of assignments, users reasonably want toPaul Robinson2015-02-261-0/+3
| | | | | | | | | | | | have the debugger step through each one individually. Turn off the combine for adjacent stores at -O0 so we get this behavior. Possibly, DAGCombine shouldn't run at all at -O0, but that's for another day; see PR22346. Differential Revision: http://reviews.llvm.org/D7181 llvm-svn: 230659
* Fix justify error for small structures in varargs for MIPS64BEPetar Jovanovic2015-02-261-0/+4
| | | | | | | | | | | | | | | There was a problem when passing structures as variable arguments. The structures smaller than 64 bit were not left justified on MIPS64 big endian. This is now fixed by shifting the value to make it left- justified when appropriate. This fixes the bug http://llvm.org/bugs/show_bug.cgi?id=21608 Patch by Aleksandar Beserminji. Differential Revision: http://reviews.llvm.org/D7881 llvm-svn: 230657
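The left-justification mentioned in this commit amounts to a shift within a 64-bit slot. A minimal sketch, assuming the value sits in the low bits of the slot and its size in bits is known; the helper name `leftJustify64` is hypothetical and this is not the code from the patch:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: move a value occupying the low SizeInBits bits
// into the most significant bits of a 64-bit slot.
inline std::uint64_t leftJustify64(std::uint64_t Value, unsigned SizeInBits) {
  assert(SizeInBits >= 1 && SizeInBits <= 64 && "value must fit one slot");
  // The shift amount is in [0, 63], so the shift is well defined.
  return Value << (64 - SizeInBits);
}
```

For example, a 16-bit structure holding `0xABCD` would end up as `0xABCD000000000000` in its slot.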
* Use ".arch_extension" ARM directive to support hwdiv on kraitSumanth Gundapaneni2015-02-261-3/+12
| | | | | | | | | In case of "krait" CPU, asm printer doesn't emit any ".cpu" so the feature bits are not computed. This patch lets the asm printer emit ".cpu cortex-a9" directive for krait and the hwdiv feature is enabled through ".arch_extension". In short, krait is treated as "cortex-a9" with hwdiv. We cannot emit ".krait" as CPU since it is not supported by GNU GAS yet. llvm-svn: 230651
* Use ".arch_extension" ARM directive to specify the additional CPU featuresSumanth Gundapaneni2015-02-264-0/+75
| | | | | | | | | This patch is in response to r223147 where the available features are computed based on ".cpu" directive. This will work cleanly for the standard variants like cortex-a9. For custom variants which rely on standard cpu names for assembly, the additional features of a CPU should be propagated. This can be done via ".arch_extension" as long as the assembler supports it. The implementation for krait along with unit test will be submitted in the next patch. llvm-svn: 230650
* [LV/LoopAccesses] Backward dependences are not safe just because theAdam Nemet2015-02-261-2/+2
| | | | | | | | | | | accesses are via different types Noticed this while generalizing the code for loop distribution. I confirmed with Arnold that this was indeed a bug and managed to create a testcase. llvm-svn: 230647
* R600/SI: Remove M0 from DS assembly stringsTom Stellard2015-02-261-8/+8
| | | | | | This matches the assembly syntax for the proprietary compiler. llvm-svn: 230645
* [X86][Haswell][SchedModel] Fix WriteMULm latency.Michael Kuperstein2015-02-261-1/+1
| | | | | | | The latency for the WriteMULm class was set to 4, which is actually lower than the latency for WriteMULr (5). A better estimate would be 4 added to WriteMULr, that is, 9. llvm-svn: 230634
* [x86] Sink the single-input v8i16 lowering code that is actuallyChandler Carruth2015-02-261-24/+26
| | | | | | | | | | formulaic into the top v8i16 lowering routine. This makes the generalized lowering a completely general and single path lowering which will allow generalizing it in turn for multiple 128-bit lanes. llvm-svn: 230623
* [x86] Remove a SimpleTy usage. No need for it here, we already have theChandler Carruth2015-02-261-2/+2
| | | | | | MVT. llvm-svn: 230622
* IRCE: only touch loops that have been shown to have a highSanjoy Das2015-02-261-4/+17
| | | | | | backedge-taken count in profiling data. llvm-svn: 230619
* IRCE: generalize to handle loops with decreasing induction variables.Sanjoy Das2015-02-261-207/+358
| | | | | | | IRCE can now split the iteration space for loops like:
    for (i = n; i >= 0; i--)
      a[i + k] = 42; // bounds check on access
  llvm-svn: 230618
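The idea of splitting the iteration space of such a decreasing loop can be sketched at the source level as below. This is an illustration of the general technique, not IRCE's implementation: the function names are hypothetical, and a failing range check is modeled as simply skipping the store, which is a simplification.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Index = std::int64_t;

// Store with the range check kept. A failing check just skips the store
// here; this stands in for whatever the real failure path would do.
static void checkedStore(std::vector<int> &A, Index Idx) {
  if (Idx >= 0 && Idx < static_cast<Index>(A.size()))
    A[static_cast<std::size_t>(Idx)] = 42;
}

// The loop from the commit message, with a check on every access.
void originalLoop(std::vector<int> &A, Index N, Index K) {
  for (Index I = N; I >= 0; --I)
    checkedStore(A, I + K);
}

// The same loop split into three ranges. I + K is in bounds exactly when
// Lo <= I <= Hi, so only the middle loop may drop the check.
void splitLoop(std::vector<int> &A, Index N, Index K) {
  const Index Len = static_cast<Index>(A.size());
  const Index Lo = std::max<Index>(0, -K);          // smallest safe I
  const Index Hi = std::min<Index>(N, Len - K - 1); // largest safe I
  Index I = N;
  for (; I >= 0 && I > Hi; --I) // iterations above the safe range
    checkedStore(A, I + K);
  for (; I >= Lo; --I) {        // safe range: no check needed
    assert(I + K >= 0 && I + K < Len);
    A[static_cast<std::size_t>(I + K)] = 42;
  }
  for (; I >= 0; --I)           // iterations below the safe range
    checkedStore(A, I + K);
}
```

Both functions leave the vector in the same state for any `N` and `K`; the split version simply concentrates the checks in the first and last loops.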
* [x86] Make the vector shuffle helpers order the SDLoc and MVT arguments.Chandler Carruth2015-02-261-27/+27
| | | | | | This ordering matches that of DAG.getNode. llvm-svn: 230617
* [LoopAccesses] Add command-line option for RuntimeMemoryCheckThresholdAdam Nemet2015-02-261-10/+11
| | | | | | | | Also remove the somewhat misleading initializers from VectorizationFactor and VectorizationInterleave. They will get initialized with the default ctor since no cl::init is provided. llvm-svn: 230608
* IRCE: print newline after printing an InductiveRangeCheck.Sanjoy Das2015-02-261-0/+1
| | | | llvm-svn: 230607
* Pass /nologo to ml64 for quieter buildsReid Kleckner2015-02-261-1/+1
| | | | | | | It still prints "Assembling path/to/X86CompilationCallback_Win64.asm", but linking does the same thing. llvm-svn: 230596
* PlaceSafepoints: use IRBuilder helpersRamkumar Ramachandra2015-02-262-44/+50
| | | | | | | | | | Use the IRBuilder helpers for gc.statepoint and gc.result, instead of coding the construction by hand. Note that the gc.statepoint IRBuilder handles only CallInst, not InvokeInst; retain that part of hand-coding. Differential Revision: http://reviews.llvm.org/D7518 llvm-svn: 230591
* Remove a FIXME.Eric Christopher2015-02-261-1/+0
| | | | | | | | | | Explanation: This function is in TargetLowering because it uses RegClassForVT which would need to be moved to TargetRegisterInfo and would necessitate moving isTypeLegal over as well - a massive change that would just require TargetLowering having a TargetRegisterInfo class member that it would use. llvm-svn: 230585
* Remove an argument-less call to getSubtargetImpl from TargetLoweringBase.Eric Christopher2015-02-2622-37/+45
| | | | | | | | | This required plumbing a TargetRegisterInfo through computeRegisterProperties and into findRepresentativeClass which uses it for register class iteration. This required passing a subtarget into a few target specific initializations of TargetLowering. llvm-svn: 230583
* MemDepPrinter: Fix some nits introduced in r228596Ramkumar Ramachandra2015-02-251-3/+3
| | | | | | Differential Revision: http://reviews.llvm.org/D7644 llvm-svn: 230582
* Object: Handle Mach-O kext bundle filesJustin Bogner2015-02-254-0/+4
| | | | | | This particular subtype of Mach-O was missing. Add it. llvm-svn: 230567
* InstrProf: Make the __llvm_profile_runtime_user symbol hiddenJustin Bogner2015-02-251-0/+1
| | | | | | | | | This symbol exists only to pull in the required pieces of the runtime, so nothing ever needs to refer to it. Making it hidden avoids the potential for issues with duplicate symbols when linking profiled libraries together. llvm-svn: 230566
* IR: Drop newline from AssemblyWriter::printMDNodeBody()Duncan P. N. Exon Smith2015-02-251-1/+1
| | | | | | | | | | | Remove a newline from `AssemblyWriter::printMDNodeBody()`, and add one to `AssemblyWriter::writeMDNode()`. NFCI for assembly output. However, this drops an inconsistent newline from `Metadata::print()` when `this` is an `MDNode`. Now the newline added by `Metadata::dump()` won't look so verbose. llvm-svn: 230565
* only propagate equality comparisons of FP values that we are certain are non-zero Sanjay Patel 2015-02-25 1 -3/+7
| | | | | | | | | | This is a follow-on to r227491 which tightens the check for propagating FP values. If a non-constant value happens to be a zero, we would hit the same bug as before. Bug noted and patch suggested by Eli Friedman. llvm-svn: 230564
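The hazard with zero here is presumably the signed-zero case: `-0.0` and `+0.0` compare equal but are not interchangeable, so replacing one with the other on the strength of an equality comparison can change later results. A small sketch of that distinction follows; the function name is hypothetical and the code is unrelated to the patch itself.

```cpp
#include <cmath>

// Hypothetical helper: true when X and Y compare equal but differ in
// sign bit, i.e. substituting one for the other is not value-preserving.
inline bool equalButNotInterchangeable(double X, double Y) {
  return X == Y && std::signbit(X) != std::signbit(Y);
}
```

Under IEEE 754 arithmetic, dividing 1.0 by each of the two zeros yields infinities of opposite sign, which is one way such a substitution becomes observable.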
* InstrProf: Remove dead code in CoverageMappingReaderJustin Bogner2015-02-251-13/+3
| | | | | | | Remove a default argument that's never passed and a constructor that's never called. llvm-svn: 230563
* Move TargetLoweringBase::getTypeConversion to the .cpp file fromEric Christopher2015-02-251-0/+132
| | | | | | | | | the .h file. It's used in only one place (other than recursively) and there's no need to include it everywhere. Saves almost 900k from total llvm object file size. llvm-svn: 230561
* InstCombine: extract instead of shuffle when performing vector/array type punning JF Bastien 2015-02-25 1 -5/+116
| | | | | | | | | | | | | Summary: SROA generates code that isn't quite as easy to optimize and contains unusual-sized shuffles, but that code is generally correct. As discussed in D7487 the right place to clean things up is InstCombine, which will pick up the type-punning pattern and transform it into a more obvious bitcast+extractelement, while leaving the other patterns SROA encounters as-is. Test Plan: make check Reviewers: jvoung, chandlerc Subscribers: llvm-commits llvm-svn: 230560
* [dwarfdump] Fix frame info register number dump.Frederic Riss2015-02-251-1/+1
| | | | llvm-svn: 230559
* IR: Annotate dump methods with LLVM_DUMP_METHODDuncan P. N. Exon Smith2015-02-251-0/+6
| | | | | | | | | It turns out we have a macro to ensure that debuggers can access `dump()` methods. Use it. Hopefully this will prevent me (and others) from committing crimes like in r223802 (search for /10000/, or just see the fix in r224407). llvm-svn: 230555
* Try to appease buildbots.Frederic Riss2015-02-251-1/+1
| | | | | | It seems ArrayRefs to multi-dimensional arrays confuse some compilers. llvm-svn: 230554
* [PowerPC] Make LDtocL and friends invariant loadsHal Finkel2015-02-254-34/+54
| | | | | | | | | | | | | | | | | | | | | LDtocL, and other loads that roughly correspond to the TOC_ENTRY SDAG node, represent loads from the TOC, which is invariant. As a result, these loads can be hoisted out of loops, etc. In order to do this, we need to generate GOT-style MMOs for TOC_ENTRY, which requires treating it as a legitimate memory intrinsic node type. Once this is done, the MMO transfer is automatically handled for TableGen-driven instruction selection, and for nodes generated directly in PPCISelDAGToDAG, we need to transfer the MMOs manually. Also, we were not transferring MMOs associated with pre-increment loads, so do that too. Lastly, this fixes an exposed bug where R30 was not added as a defined operand of UpdateGBR. This problem was highlighted by an example (used to generate the test case) posted to llvmdev by Francois Pichet. llvm-svn: 230553
* [dwarfdump] Make debug_frame dump actually useful.Frederic Riss2015-02-251-3/+136
| | | | | | | | | | | | | | | | | | | | | | | | This adds support for pretty-printing instruction operands. The new output looks like:
    00000000 00000010 ffffffff CIE
      Version: 1
      Augmentation:
      Code alignment factor: 1
      Data alignment factor: -4
      Return address column: 8
      DW_CFA_def_cfa: reg4 +4
      DW_CFA_offset: reg8 -4
      DW_CFA_nop:
      DW_CFA_nop:
    00000014 00000010 00000000 FDE cie=00000000 pc=00000000...00000022
      DW_CFA_advance_loc: 3
      DW_CFA_def_cfa_offset: +12
      DW_CFA_nop:
  llvm-svn: 230551
* [dwarfdump] Don't print meaningless pointer.Frederic Riss2015-02-251-3/+0
| | | | | | | CIE pointers were never filled in before, and printing the pointer is totally pointless anyway. llvm-svn: 230550
* DWARFDebugFrame: Move some code around. NFC.Frederic Riss2015-02-251-13/+10
| | | | | | | Move the FrameEntry::dumpInstructions down in the file at some place where it can see the declarations of FDE and CIE. llvm-svn: 230549
* DWARFDebugFrame: Add some trivial accessors. NFC.Frederic Riss2015-02-251-0/+5
| | | | | | To be used for dumping. llvm-svn: 230548
* DWARFDebugFrame: Actually collect CIEs associated with FDEs.Frederic Riss2015-02-251-6/+11
| | | | | | | | This is the first commit in a small series aiming at making debug_frame dump more useful (right now it prints a list of operations without their operands). llvm-svn: 230547
* [LTO API] fix memory leakage introduced at r230290.Manman Ren2015-02-251-4/+15
| | | | | | | | r230290 released the LLVM module but not the LTOModule. rdar://19024554 llvm-svn: 230544
* X86, Win64: Allow 'mov' to restore the stack pointer if we have a FPDavid Majnemer2015-02-251-13/+12
| | | | | | | | | | | | | | | | | | The Win64 epilogue structure is very restrictive; it permits a very small number of opcodes and none of them are 'mov'. This means that given:
    mov %rbp, %rsp
    pop %rbp
  The mov isn't the epilogue, only the pop is. This is problematic unless a frame pointer is present, in which case we are free to do whatever we'd like in the "body" of the function. If a frame pointer is present, unwinding will undo the prologue operations in reverse order regardless of the fact that we are at an instruction which is resetting the stack pointer. llvm-svn: 230543
* LowerBitSets: Align referenced globals.Peter Collingbourne2015-02-251-22/+40
| | | | | | | | | | | | | | | | This change aligns globals to the next highest power of 2 bytes, up to a maximum of 128. This makes it more likely that we will be able to compress bit sets with a greater alignment. In many more cases, we can now take advantage of a new optimization also introduced in this patch that removes bit set checks if the bit set is all ones. The 128 byte maximum was found to provide the best tradeoff between instruction overhead and data overhead in a recent build of Chromium. It allows us to remove ~2.4MB of instructions at the cost of ~250KB of data. Differential Revision: http://reviews.llvm.org/D7873 llvm-svn: 230540
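The alignment rule described in this commit can be sketched as a small helper. This assumes "next highest power of 2" means the smallest power of two that is not less than the global's size, which is an interpretation rather than something the message states; the name `cappedPow2Align` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: smallest power of two >= Size, capped at MaxAlign.
// Whether an exact power-of-two size is bumped further is an assumption.
inline std::uint64_t cappedPow2Align(std::uint64_t Size,
                                     std::uint64_t MaxAlign = 128) {
  assert(MaxAlign != 0 && (MaxAlign & (MaxAlign - 1)) == 0 &&
         "cap must be a power of two");
  std::uint64_t Align = 1;
  while (Align < Size && Align < MaxAlign)
    Align <<= 1; // stays a power of two and never exceeds MaxAlign
  return Align;
}
```

Under this reading a 24-byte global would be aligned to 32 bytes, while anything larger than 128 bytes stays at the 128-byte cap.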
* Fixing a problem with insert location in WinEH outliningAndrew Kaylor2015-02-251-0/+1
| | | | llvm-svn: 230535