path: root/llvm/lib
Columns: Commit message, Author, Age, Files, Lines (-removed/+added)
* [X86][AVX] Fix VBROADCASTF128 selection bug (PR28770)Simon Pilgrim2016-07-291-7/+25
| | | | | | Support for lowering to VBROADCASTF128 etc. in D22460 was not correctly ensuring that the only users of the 128-bit vector load were the insertions of the vector into the lower/upper subvectors. llvm-svn: 277214
* [msf] Resubmit "Rename Msf -> MSF".Zachary Turner2016-07-2932-136/+136
| | | | | | | | | | | | | Previously this change was submitted from a Windows machine, so changes made to the case of filenames and directory names did not survive the commit, and as a result the CMake source file names and the on-disk file names did not match on case-sensitive file systems. I'm resubmitting this patch from a Linux system, which hopefully allows the case changes to make it through unfettered. llvm-svn: 277213
* CodeGen: add new "intrinsic" MachineOperand kind.Tim Northover2016-07-297-9/+78
| | | | | | | This will be used during GlobalISel, where we need a more robust and readable way to write tests than a simple immediate ID. llvm-svn: 277209
* [LoopUnroll] Include hotness of region in opt remarkAdam Nemet2016-07-293-34/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | LoopUnroll is a loop pass, so the analysis of OptimizationRemarkEmitter is added to the common function analysis passes that loop passes depend on. The BFI and, indirectly, the BPI used in this pass are computed lazily, so no overhead should be observed unless -pass-remarks-with-hotness is used. This is how the patch affects the O3 pipeline: Dominator Tree Construction Natural Loop Information Canonicalize natural loops Loop-Closed SSA Form Pass Basic Alias Analysis (stateless AA impl) Function Alias Analysis Results Scalar Evolution Analysis + Lazy Branch Probability Analysis + Lazy Block Frequency Analysis + Optimization Remark Emitter Loop Pass Manager Rotate Loops Loop Invariant Code Motion Unswitch loops Simplify the CFG Dominator Tree Construction Basic Alias Analysis (stateless AA impl) Function Alias Analysis Results Combine redundant instructions Natural Loop Information Canonicalize natural loops Loop-Closed SSA Form Pass Scalar Evolution Analysis + Lazy Branch Probability Analysis + Lazy Block Frequency Analysis + Optimization Remark Emitter Loop Pass Manager Induction Variable Simplification Recognize loop idioms Delete dead loops Unroll loops ... llvm-svn: 277203
* Fixed (incorrectly firing) MSVC unused variable warningSimon Pilgrim2016-07-291-2/+1
| | | | llvm-svn: 277198
* [ConstantFolding] Handle bitcasts of undef fp vector elementsDavid Majnemer2016-07-291-1/+1
| | | | | | | | | We used the wrong type for constructing a zero vector element which led to type mismatches. This fixes PR28771. llvm-svn: 277197
* Fixed MSVC out of range shift warningSimon Pilgrim2016-07-291-1/+1
| | | | llvm-svn: 277195
* Revert "[msf] Rename Msf to MSF."Zachary Turner2016-07-2932-136/+136
| | | | | | This reverts commit 4d1557ffac41e079bcb1abbcf04f512474dcd6fe. llvm-svn: 277194
* [msf] Rename Msf to MSF.Zachary Turner2016-07-2932-136/+136
| | | | | | | | In a previous patch, it was suggested to use all caps instead of rolling caps for initialisms, so this patch changes everything to do this. llvm-svn: 277190
* Recommitting r275284: add support to inline __builtin_mempcpyAndrew Kaylor2016-07-294-0/+50
| | | | | | | | Patch by Sunita Marathe. Third try, now following fixes to MSan to handle mempcpy in such a way that this commit won't break the MSan buildbots. (Thanks, Evgeniy!) llvm-svn: 277189
* GlobalISel: make translate* functions take the most specialized class possible.Tim Northover2016-07-291-15/+14
| | | | | | NFC. llvm-svn: 277188
* Codegen: MachineBlockPlacement Improve probability layout.Kyle Butt2016-07-291-15/+45
| | | | | | | | | | | | | | | | | | | | | | | | | The following pattern was being laid out poorly: A / \ B C / \ / \ D E ? (Doesn't matter) Where A->B is far more likely than A->C, and prob(B->D) = prob(B->E) The current algorithm gives: A,B,C,E (D goes on worklist) It does this even if C has a frequency count of 0. This patch adjusts the layout calculation so that if freq(B->E) >> freq(C->E) then we go ahead and lay out E rather than C. Fallthrough half the time is better than fallthrough never, or fallthrough very rarely. The resulting layout is: A,B,E, (C and D are in a worklist) llvm-svn: 277187
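The decision described in this commit message can be sketched as a plain frequency comparison. This is only an illustration under stated assumptions: the function names, the raw edge-frequency inputs, and the `ratio` threshold standing in for ">>" are all hypothetical and are not taken from MachineBlockPlacement.

```python
def pick_next_block(freq_b_to_e, freq_c_to_e, ratio=2.0):
    """Choose which block follows B in the pattern above.

    Returns "E" when the B->E edge is much hotter than the C->E edge,
    otherwise "C" (the choice the older heuristic made).
    `ratio` is a hypothetical stand-in for the ">>" in the commit message.
    """
    if freq_b_to_e > ratio * freq_c_to_e:
        return "E"
    return "C"


def layout(freq_b_to_e, freq_c_to_e):
    """Return the block chain in the notation used by the commit message."""
    # A and B always come first because A->B is the hot edge.
    if pick_next_block(freq_b_to_e, freq_c_to_e) == "E":
        return ["A", "B", "E"]        # C and D stay on the worklist
    return ["A", "B", "C", "E"]       # D stays on the worklist
```

With a cold C (for example `layout(50, 0)`), the sketch picks E directly after B, which is the fallthrough the commit message argues for.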
* GlobalISel: add generic conditional branch.Tim Northover2016-07-292-7/+20
| | | | | | | Just the basic equivalent to DAG's condbr for now, we'll get to things like br_cc when we start doing more legalization. llvm-svn: 277184
* Revert r277178, the actual change had already been appliedKrzysztof Parzyszek2016-07-291-0/+1
| | | | | | Will submit another patch with the testcase only. llvm-svn: 277180
* [Hexagon] Misaligned loads and stores are not fastKrzysztof Parzyszek2016-07-291-1/+0
| | | | | | | | | | | | The DAG combiner tries to merge stores to adjacent vector wide memory locations by creating stores which are integral multiples of the vector width. Discourage this by informing it that this is slow. This should not affect legalization passes, because all of them ignore the "Fast" argument. Patch by Pranav Bhandarkar. llvm-svn: 277178
* The next step along the way to getting good error messages for bad archives.Kevin Enderby2016-07-292-89/+230
| | | | | | | | | | | | | | | | | | | | | | | | As mentioned in the commit log for r276686, this next step adds a new method to the ArchiveMemberHeader class that gets the full name with proper error checking and can be used for error messages. To do this, ArchiveMemberHeader::getName() is renamed to ArchiveMemberHeader::getRawName() to be consistent with Archive::Child::getRawName(). The “new” method is a new implementation of ArchiveMemberHeader::getName() which gets the full name and provides proper error checking. It is mostly a rewrite of what was Archive::Child::getName(), cleaning up incorrect uses of llvm_unreachable() in the code which were actually just cases of errors in the input Archives. Archive::Child::getName() is then changed to return Expected<> and use the new implementation of ArchiveMemberHeader::getName(). Archive::getMemoryBufferRef() also needed to change to return Expected<> to propagate Errors up, as did Archive::isThinMember(). llvm-svn: 277177
* CodeGen: improve MachineInstrBuilder & MachineIRBuilder interfaceTim Northover2016-07-293-53/+56
| | | | | | | | | | | | | | For MachineInstrBuilder, having to manually use RegState::Define is ugly and makes register definitions clunkier than they need to be, so this adds two convenience functions: addDef and addUse. For MachineIRBuilder, we want to avoid BuildMI's first-reg-is-def rule because it's hidden away and causes bugs. So this patch switches buildInstr to returning a MachineInstrBuilder and adding *all* operands via addDef/addUse. NFC. llvm-svn: 277176
* [AArch64][GlobalISel] Select G_XOR.Ahmed Bougacha2016-07-291-0/+5
| | | | llvm-svn: 277173
* [GlobalISel] Add G_XOR.Ahmed Bougacha2016-07-291-0/+2
| | | | llvm-svn: 277172
* [AArch64][GlobalISel] Select G_LOAD/G_STORE.Ahmed Bougacha2016-07-292-2/+62
| | | | | | | | | | Mostly straightforward as we ignore addressing modes and just use the base + unsigned immediate offset (always 0) variants. This currently fails to select extloads because we have yet to agree on a representation. llvm-svn: 277171
* MachinePipeliner pass that implements Swing Modulo SchedulingBrendon Cahoon2016-07-296-4/+4084
| | | | | | | | | | | | | | | | | | | | | | | | Software pipelining is an optimization for improving ILP by overlapping loop iterations. Swing Modulo Scheduling (SMS) is an implementation of software pipelining that attempts to reduce register pressure and generate efficient pipelines with a low compile-time cost. This implementation of SMS is a target-independent back-end pass. When enabled, the pass should run just prior to the register allocation pass, while the machine IR is in SSA form. If the pass is successful, then the original loop is replaced by the optimized loop. The optimized loop contains one or more prolog blocks, the pipelined kernel, and one or more epilog blocks. This pass is enabled for Hexagon only. To enable for other targets, a couple of target specific hooks must be implemented, and the pass needs to be called from the target's TargetMachine implementation. Differential Review: http://reviews.llvm.org/D16829 llvm-svn: 277169
* [Hexagon] Custom lower VECTOR_SHUFFLE and EXTRACT_SUBVECTOR for HVXKrzysztof Parzyszek2016-07-293-22/+199
| | | | | | | | | | | | | | | | If the mask of a vector shuffle has alternating odd or even numbers starting with 1 or 0 respectively up to the largest possible index for the given type in the given HVX mode (single or double) we can generate a vpacko or vpacke instruction respectively. E.g. %42 = shufflevector <32 x i16> %37, <32 x i16> %41, <32 x i32> <i32 1, i32 3, ..., i32 63> is %42.h = vpacko(%41.w, %37.w) Patch by Pranav Bhandarkar. llvm-svn: 277168
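The mask condition in this commit message can be illustrated with a small classifier. It is a sketch only: `classify_pack_mask` is a hypothetical helper, and it ignores the element-type and HVX-mode checks that the real lowering would also need.

```python
def classify_pack_mask(mask):
    """Return "vpacko", "vpacke", or None for a two-input shuffle mask.

    With N result elements the two inputs hold 2*N elements in total, so
    the odd pattern is 1, 3, ..., 2*N-1 and the even pattern is
    0, 2, ..., 2*N-2.
    """
    mask = list(mask)
    n = len(mask)
    if n == 0:
        return None
    if mask == list(range(1, 2 * n, 2)):
        return "vpacko"
    if mask == list(range(0, 2 * n, 2)):
        return "vpacke"
    return None
```

For the `<32 x i16>` example in the message, the mask `1, 3, ..., 63` has 32 entries and matches the odd pattern.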
* Initial support for vectorization using svml (short vector math library).Matt Masten2016-07-291-1/+71
| | | | | | Differential Revision: https://reviews.llvm.org/D19544 llvm-svn: 277166
* [Hexagon] Improve balancing of address calculationKrzysztof Parzyszek2016-07-291-3/+738
| | | | | | | | | Rebalances address calculation trees and applies Hexagon-specific optimizations to the trees to improve instruction selection. Patch by Tobias Edler von Koch. llvm-svn: 277151
* Avoid unnecessary 32-bit to 64-bit zero extensions followingDavid L Kreitzer2016-07-291-4/+2
| | | | | | | | | 32-bit CMOV instructions on x86_64. The 32-bit CMOV implicitly zero extends. Differential Revision: https://reviews.llvm.org/D22941 llvm-svn: 277148
* [MC] When emitting output hash comments always use standard line comment separatorNirav Dave2016-07-292-4/+5
| | | | llvm-svn: 277146
* Fix license information in the file headerKrzysztof Parzyszek2016-07-291-2/+5
| | | | llvm-svn: 277145
* Add missing files to r277143Krzysztof Parzyszek2016-07-292-0/+213
| | | | llvm-svn: 277144
* [Hexagon] Implement DFA based hazard recognizerKrzysztof Parzyszek2016-07-292-3/+11
| | | | | | | | | | | The post register allocator scheduler can generate poor schedules because the scoreboard hazard recognizer is unable to identify hazards for Hexagon precisely. Instead, Hexagon should use a DFA based hazard recognizer. Patch by Brendon Cahoon. llvm-svn: 277143
* Re-commit: [mips][fastisel] Handle 0-4 arguments without SelectionDAG.Daniel Sanders2016-07-291-2/+158
| | | | | | | | | | | | | | | | | | | | | | Summary: Implements fastLowerArguments() to avoid the need to fall back on SelectionDAG for 0-4 argument functions that don't do tricky things like passing double in a pair of i32's. This allows us to move all except one test to -fast-isel-abort=3. The remaining one has function prototypes of the form 'i32 (i32, double, double)' which requires floats to be passed in GPR's. The previous commit had an uninitialized variable that caused the incoming argument region to have undefined size. This has been fixed. Reviewers: sdardis Subscribers: dsanders, llvm-commits, sdardis Differential Revision: https://reviews.llvm.org/D22680 llvm-svn: 277136
* Cleanup TransferDbgValuesNirav Dave2016-07-291-2/+9
| | | | | | | | | | | | | | | [DAG] Check debug values for invalidation before transferring and mark old debug values invalid when transferring to another SDValue. This fixes PR28613. Reviewers: jyknight, hans, dblaikie, echristo Subscribers: yaron.keren, ismail, llvm-commits Differential Revision: https://reviews.llvm.org/D22858 llvm-svn: 277135
* [X86][SSE] Optimize the truncation of vector comparison results with PACKSSSimon Pilgrim2016-07-291-2/+139
| | | | | | | | | | | | We currently default to using either generic shuffles or MASK+PACKUS/PACKSS to truncate all integer vectors. For vector comparisons, we know that the result will be either all or zero bits in every element, which can be efficiently truncated by directly using PACKSS to repeatedly halve the size of each element. Due to the limited input values (-1 or 0) we don't need to account for vector element size, so for simplicity we just use the PACKSS(vXi16,vXi16) implementation in all cases. Additionally for AVX2 PACKSS of 256bit data we must perform a PERMQ shuffle to reorder the data into the correct order. I did investigate performing a single shuffle after all the PACKSS calls but the need to cross 128bit lanes makes this difficult to achieve efficiently. We avoid performing this on AVX512 as it should have better alternative truncation instructions. Differential Revision: https://reviews.llvm.org/D22814 llvm-svn: 277132
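Why PACKSS is safe for comparison results can be seen in a scalar model of one pack step. The helpers below are hypothetical and model individual signed lanes in Python; they are not the X86 lowering code. Signed saturation leaves -1 and 0 unchanged, so on those inputs it agrees with truncation, while on arbitrary values it does not.

```python
def saturate_i16_to_i8(x):
    """Signed saturation of one 16-bit lane to 8 bits, as PACKSS does per lane."""
    return max(-128, min(127, x))


def packss_i16(a, b):
    """Model PACKSS on two lists of signed 16-bit lanes: the saturated
    lanes of `a` are followed by the saturated lanes of `b`."""
    return [saturate_i16_to_i8(x) for x in a + b]


def truncate_i16_to_i8(x):
    """Plain truncation: keep the low 8 bits, reinterpreted as signed."""
    low = x & 0xFF
    return low - 256 if low >= 128 else low
```

For a comparison result such as `[-1, 0, -1, -1]`, `packss_i16` returns the same lanes that truncation would, which is the property the commit relies on.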
* Fixed MSVC out of range shift warningSimon Pilgrim2016-07-291-1/+1
| | | | llvm-svn: 277130
* Fix for commit rL277126 that broke a build.Sjoerd Meijer2016-07-291-1/+1
| | | | llvm-svn: 277129
* [Thumb] Emit Thumb move in both Thumb modes for struct_byval predicatesPrakhar Bahuguna2016-07-291-4/+5
| | | | | | | | | | | | | | | | | Summary: The MOV/MOVT instructions being chosen for struct_byval predicates was conditional only on Thumb2, resulting in an ARM MOV/MOVT instruction being incorrectly emitted in Thumb1 mode. This is especially apparent with v8-m.base targets. This patch ensures that Thumb instructions are emitted in both Thumb modes. Reviewers: rengolin, t.p.northover Subscribers: llvm-commits, aemerson, rengolin Differential Revision: https://reviews.llvm.org/D22865 llvm-svn: 277128
* [lanai] Update for Target API (TargetRegistry::RegisterMCAsmBackend) changeJacques Pienaar2016-07-292-5/+7
| | | | llvm-svn: 277127
* TargetInstrInfo: add virtual function getInstSizeInBytesSjoerd Meijer2016-07-298-8/+8
| | | | | | | | | This adds a target hook getInstSizeInBytes to TargetInstrInfo that a lot of subclasses already implement. Differential Revision: https://reviews.llvm.org/D22885 llvm-svn: 277126
* [AVX512] Mark EVEX VMOVSSrm and VMOVSDrm as canFoldAsLoad and isReMaterializable.Craig Topper2016-07-292-0/+3
| | | | llvm-svn: 277120
* [AVX512] Copy the patterns that recognize scalar arithmetic operations inserting into the lower element of a packed vector from AVX/SSE so that we can use EVEX encoded instructions.Craig Topper2016-07-292-2/+105
| | | | llvm-svn: 277119
* [EarlyCSE] Correctly handle simplified, but live, instructionsDavid Majnemer2016-07-291-2/+4
| | | | | | | | | Some instructions may have their uses replaced with a symbolic constant. However, the instruction may still have side effects which preclude it from being removed from the function. EarlyCSE treated such an instruction as if it were removed, resulting in PR28763. llvm-svn: 277114
* [ConstantFolding] Fold bitcasts of vectors w/ undef elementsDavid Majnemer2016-07-291-2/+11
| | | | | | | | An undef vector element can be treated as if it had any value. Folding such a vector element to 0 in a bitcast can open up further folding opportunities. llvm-svn: 277104
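The effect of treating an undef element as 0 during a bitcast can be sketched on plain integers. The function below is hypothetical, uses `None` to mark an undef lane, and assumes little-endian lane order (lane 0 in the least significant bits); it is not the ConstantFolding code itself.

```python
def fold_vector_bitcast_to_int(elements, elem_bits):
    """Fold a bitcast of an integer vector constant to one wide integer.

    `elements` lists the lanes from index 0 upward; `None` marks an undef
    lane, which is treated as 0. Little-endian lane order is assumed, so
    lane 0 occupies the least significant bits.
    """
    mask = (1 << elem_bits) - 1
    result = 0
    for index, value in enumerate(elements):
        lane = 0 if value is None else value & mask
        result |= lane << (index * elem_bits)
    return result
```

Under these assumptions a `<2 x i16>` constant with lane 0 equal to 1 and lane 1 undef folds to the 32-bit value 1, a concrete constant that later folds can use.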
* [ConstantFolding] Remove an unused ConstantFoldInstOperands overloadDavid Majnemer2016-07-291-14/+5
| | | | | | No functional change is intended. llvm-svn: 277101
* [ConstantFolding] Use ConstantExpr::getWithOperandsDavid Majnemer2016-07-291-0/+3
| | | | | | | | | ConstantExpr::getWithOperands does much of the hard work that ConstantFoldInstOperandsImpl tries to do but more completely. This lets us fold ExtractValue/InsertValue expressions. llvm-svn: 277100
* [ConstantFolding] Teach the folder how to fold ConstantVectorDavid Majnemer2016-07-2910-98/+105
| | | | | | | | | | | A ConstantVector can have ConstantExpr operands and vice versa. However, the folder had no ability to fold ConstantVectors which, in some cases, was an optimization barrier. Instead, rephrase the folder in terms of Constants instead of ConstantExprs and teach callers how to deal with failure. llvm-svn: 277099
* [AVX512] Remove the intrinsic forms of VMOVSS/VMOVSD. We don't need two different forms of 'rr' and 'rm'. This matches SSE/AVX.Craig Topper2016-07-291-29/+43
| | | | | | I'm not convinced the patterns for the rm_Int were correct anyway. They had a tied source that shouldn't exist for the unmasked version. The load form of MOVSS always zeros the most significant bits. I've left the patterns off the masked load instructions as I'm not sure what the correct pattern should be and we don't have any tests currently. Nor do we implement masked scalar load intrinsics in clang currently. llvm-svn: 277098
* [CFLAA] Check for pointer types in more places.George Burgess IV2016-07-291-2/+4
| | | | | | | | | | | | | | This patch fixes an assertion that fires when we try to add non-pointer Values to the CFLGraph. Centralizing the check for whether something is/isn't a pointer type isn't completely trivial (and, in some cases, would end up being entirely redundant), but it may be beneficial to do so if this trips us up more in the future. Patch by Jia Chen. Differential Revision: https://reviews.llvm.org/D22947 llvm-svn: 277096
* Added ThinLTO inlining statisticsPiotr Padlewski2016-07-293-9/+242
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: copypasta doc of ImportedFunctionsInliningStatistics class \brief Calculate and dump ThinLTO specific inliner stats. The main statistics are: (1) Number of inlined imported functions, (2) Number of imported functions inlined into importing module (indirect), (3) Number of non imported functions inlined into importing module (indirect). The difference between first and the second is that first stat counts all performed inlines on imported functions, but the second one only the functions that have been eventually inlined to a function in the importing module (by a chain of inlines). Because llvm uses bottom-up inliner, it is possible to e.g. import function `A`, `B` and then inline `B` to `A`, and after this `A` might be too big to be inlined into some other function that calls it. It calculates this statistic by building graph, where the nodes are functions, and edges are performed inlines and then by marking the edges starting from not imported function. If `Verbose` is set to true, then it also dumps statistics per each inlined function, sorted by the greatest inlines count like - number of performed inlines - number of performed inlines to importing module Reviewers: eraman, tejohnson, mehdi_amini Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D22491 llvm-svn: 277089
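The graph-based counting described in this commit message can be sketched as follows. This is a rough illustration, not the ImportedFunctionsInliningStatistics implementation: the function name, the `(caller, callee)` edge representation, and the choice to count distinct functions rather than individual inlines are all assumptions.

```python
def inlining_stats(inlines, imported):
    """Compute two of the counts described above.

    `inlines` is a list of (caller, callee) pairs, one per performed inline;
    `imported` is the set of function names imported into the module.
    Returns (inlined_imported, inlined_into_importing_module).
    """
    imported = set(imported)

    # Caller -> callees graph, one edge per performed inline.
    graph = {}
    for caller, callee in inlines:
        graph.setdefault(caller, set()).add(callee)

    # (1) Imported functions that were inlined anywhere at all.
    inlined_imported = {callee for _, callee in inlines if callee in imported}

    # (2) Imported functions reachable through inline edges that start at a
    # function which was not imported, i.e. belongs to the importing module.
    reached = set()
    stack = [func for func in graph if func not in imported]
    while stack:
        func = stack.pop()
        for callee in graph.get(func, ()):
            if callee not in reached:
                reached.add(callee)
                stack.append(callee)

    return len(inlined_imported), len(reached & imported)
```

In the scenario from the message (import `A` and `B`, inline `B` into `A`, and `A` is then too big to inline anywhere), the sketch reports one inlined imported function and none that reached the importing module.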
* Revert "Don't invoke getName() from Function::isIntrinsic().", rL276942.Justin Lebar2016-07-284-12/+11
| | | | | | | | | This broke some out-of-tree AMDGPU tests that relied on the old behavior wherein isIntrinsic() would return true for any function that starts with "llvm.". And in general that change will not play nicely with out-of-tree backends. llvm-svn: 277087
* [sanitizer] Simplify and future-proof maybeMarkSanitizerLibraryCallNoBuiltin().Evgeniy Stepanov2016-07-281-17/+6
| | | | | | | | | | | | | | | | | | Sanitizers set nobuiltin attribute on certain library functions to avoid a situation where such function is neither instrumented nor intercepted. At the moment the list of interesting functions is hardcoded. This change replaces it with logic based on TargetLibraryInfo::hasOptimizedCodegen and the presence of readnone function attribute (sanitizers are generally interested in memory behavior of library functions). This is expected to be a no-op change: the new logic matches exactly the same set of functions. r276771 (currently reverted) added mempcpy() to the list, breaking MSan tests. With this change, r276771 can be safely re-landed. llvm-svn: 277086
* [IR] Introduce a non-integral pointer typeSanjoy Das2016-07-282-0/+24
| | | | | | | | | | | | | | | Summary: This change adds a `ni` specifier in the `datalayout` string to denote pointers in some given address spaces as "non-integral", and adds some typing rules around these special pointers. Reviewers: majnemer, chandlerc, atrick, dberlin, eli.friedman, tstellarAMD, arsenm Subscribers: arsenm, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D22488 llvm-svn: 277085