path: root/llvm
Commit message (Collapse)AuthorAgeFilesLines
* Recommitting r275284: add support to inline __builtin_mempcpyAndrew Kaylor2016-07-299-1/+88
| | | | | | | | Patch by Sunita Marathe Third try, now following fixes to MSan to handle mempcpy in such a way that this commit won't break the MSan buildbots. (Thanks, Evegenii!) llvm-svn: 277189
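For readers unfamiliar with the builtin, here is a minimal sketch of what mempcpy does, written in terms of memcpy. It is not the code from the patch; the name mempcpy_sketch is made up for illustration.

```cpp
#include <cstddef>
#include <cstring>

// Illustrative only (hypothetical helper, not from the patch): mempcpy copies
// n bytes exactly like memcpy, but returns a pointer to the byte just past
// the last byte written instead of returning dst.
void *mempcpy_sketch(void *dst, const void *src, std::size_t n) {
  return static_cast<char *>(std::memcpy(dst, src, n)) + n;
}
```

The returned pointer makes it convenient to append several buffers back to back without recomputing offsets.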
* GlobalISel: make translate* functions take the most specialized class possible.Tim Northover2016-07-292-18/+17
| | | | | | NFC. llvm-svn: 277188
* Codegen: MachineBlockPlacement Improve probability layout.Kyle Butt2016-07-292-15/+213
| The following pattern was being laid out poorly:
|
|      A
|     / \
|    B   C
|   / \ / \
|  D   E   ? (Doesn't matter)
|
| Where A->B is far more likely than A->C, and prob(B->D) = prob(B->E).
| The current algorithm gives: A,B,C,E (D goes on worklist).
| It does this even if C has a frequency count of 0.
|
| This patch adjusts the layout calculation so that if freq(B->E) >> freq(C->E),
| then we go ahead and lay out E rather than C. Fallthrough half the time is
| better than fallthrough never, or fallthrough very rarely. The resulting
| layout is: A,B,E (C and D are in a worklist).
|
| llvm-svn: 277187
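The decision described in this message can be sketched as a toy comparison of edge frequencies. This is not the MachineBlockPlacement implementation: the struct, the function name, and in particular the factor of 2 used to model ">>" are assumptions made for illustration.

```cpp
#include <cstdint>

// Hypothetical edge frequencies for the CFG in the commit message.
struct EdgeFreqs {
  std::uint64_t BToE; // freq(B->E)
  std::uint64_t CToE; // freq(C->E)
};

enum class NextBlock { E, C };

// After A and B have been placed, decide which block to lay out next.
// E is chosen when B->E is much hotter than C->E; otherwise C is placed
// first, as the old algorithm did.
NextBlock chooseAfterB(const EdgeFreqs &F) {
  const std::uint64_t Ratio = 2; // assumed threshold, not taken from the patch
  return F.BToE / Ratio > F.CToE ? NextBlock::E : NextBlock::C;
}
```

With freq(C->E) equal to 0 and any sizeable freq(B->E), the sketch picks E, which matches the A,B,E layout the message describes.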
* Tests: Add branch weights to non-layout tests.Kyle Butt2016-07-294-12/+20
| | | | | | | Add branch weights to a few tests that aren't testing layout to make them less sensitive to changes in the layout algorithm. llvm-svn: 277186
* GlobalISel: add generic conditional branch.Tim Northover2016-07-296-7/+75
| | | | | | | Just the basic equivalent to DAG's condbr for now, we'll get to things like br_cc when we start doing more legalization. llvm-svn: 277184
* [Hexagon] Testcase for not merging stores into a misaligned storeKrzysztof Parzyszek2016-07-291-0/+46
| | | | | | | | | | | The DAG combiner will try to merge consecutive stores into a bigger store, unless the resulting store is not fast. Misaligned vector stores are allowed on Hexagon, but are not fast. Add a testcase to make sure this type of merging does not occur. Patch by Pranav Bhandarkar. llvm-svn: 277182
* Revert r277178, the actual change had already been appliedKrzysztof Parzyszek2016-07-292-46/+1
| | | | | | Will submit another patch with the testcase only. llvm-svn: 277180
* [Hexagon] Misaligned loads and stores are not fastKrzysztof Parzyszek2016-07-292-1/+46
| | | | | | | | | | | | The DAG combiner tries to merge stores to adjacent vector wide memory locations by creating stores which are integral multiples of the vector width. Discourage this by informing it that this is slow. This should not affect legalization passes, because all of them ignore the "Fast" argument. Patch by Pranav Bhandarkar. llvm-svn: 277178
* The next step along the way to getting good error messages for bad archives.Kevin Enderby2016-07-2915-117/+351
| | | | | | | | | | | | | | | | | | | | | | | | As mentioned in commit log for r276686 this next step is adding a new method in the ArchiveMemberHeader class to get the full name that does proper error checking, and can be used for error messages. To do this the name of ArchiveMemberHeader::getName() is changed to ArchiveMemberHeader::getRawName() to be consistent with Archive::Child::getRawName(). Then the “new” method is the addition of a new implementation of ArchiveMemberHeader::getName() which gets the full name and provides proper error checking. This is mostly a rewrite of what was Archive::Child::getName(), cleaning up incorrect uses of llvm_unreachable() in the code which were actually just cases of errors in the input Archives. Then Archive::Child::getName() is changed to return Expected<> and use the new implementation of ArchiveMemberHeader::getName(). Archive::getMemoryBufferRef() also needed to change to return Expected<> to propagate Errors up, as did Archive::isThinMember(). llvm-svn: 277177
* CodeGen: improve MachineInstrBuilder & MachineIRBuilder interfaceTim Northover2016-07-295-120/+96
| | | | | | | | | | | | | | For MachineInstrBuilder, having to manually use RegState::Define is ugly and makes register definitions clunkier than they need to be, so this adds two convenience functions: addDef and addUse. For MachineIRBuilder, we want to avoid BuildMI's first-reg-is-def rule because it's hidden away and causes bugs. So this patch switches buildInstr to returning a MachineInstrBuilder and adding *all* operands via addDef/addUse. NFC. llvm-svn: 277176
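The addDef/addUse idea can be illustrated with a toy fluent builder. The classes below are stand-ins invented for this sketch and are not the LLVM MachineInstrBuilder API; only the method names addDef and addUse come from the message.

```cpp
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins, not the LLVM classes: an operand is a register number plus
// a flag saying whether the instruction defines it.
struct ToyOperand {
  unsigned Reg;
  bool IsDef;
};

struct ToyInstr {
  std::string Opcode;
  std::vector<ToyOperand> Operands;
};

class ToyInstrBuilder {
  ToyInstr Instr;

public:
  explicit ToyInstrBuilder(std::string Opcode) {
    Instr.Opcode = std::move(Opcode);
  }

  // Every operand states its role explicitly, so nothing depends on a
  // hidden "first register is the definition" rule.
  ToyInstrBuilder &addDef(unsigned Reg) {
    Instr.Operands.push_back({Reg, true});
    return *this;
  }

  ToyInstrBuilder &addUse(unsigned Reg) {
    Instr.Operands.push_back({Reg, false});
    return *this;
  }

  const ToyInstr &get() const { return Instr; }
};
```

A call chain such as `ToyInstrBuilder("G_ADD").addDef(1).addUse(2).addUse(3)` then reads the same way the operands are meant: one definition followed by two uses.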
* [AArch64][GlobalISel] Select G_XOR.Ahmed Bougacha2016-07-292-0/+56
| | | | llvm-svn: 277173
* [GlobalISel] Add G_XOR.Ahmed Bougacha2016-07-294-0/+36
| | | | llvm-svn: 277172
* [AArch64][GlobalISel] Select G_LOAD/G_STORE.Ahmed Bougacha2016-07-293-2/+168
| | | | | | | | | | Mostly straightforward as we ignore addressing modes and just use the base + unsigned immediate offset (always 0) variants. This currently fails to select extloads because we have yet to agree on a representation. llvm-svn: 277171
* [GlobalISel] Add LLT raw_ostream operator<< overload.Ahmed Bougacha2016-07-291-0/+5
| | | | | | Helpful when debugging; will be used in the following commit. llvm-svn: 277170
* MachinePipeliner pass that implements Swing Modulo SchedulingBrendon Cahoon2016-07-2921-7/+4583
| | | | | | | | | | | | | | | | | | | | | | | | Software pipelining is an optimization for improving ILP by overlapping loop iterations. Swing Modulo Scheduling (SMS) is an implementation of software pipelining that attempts to reduce register pressure and generate efficient pipelines with a low compile-time cost. This implementation of SMS is a target-independent back-end pass. When enabled, the pass should run just prior to the register allocation pass, while the machine IR is in SSA form. If the pass is successful, then the original loop is replaced by the optimized loop. The optimized loop contains one or more prolog blocks, the pipelined kernel, and one or more epilog blocks. This pass is enabled for Hexagon only. To enable for other targets, a couple of target specific hooks must be implemented, and the pass needs to be called from the target's TargetMachine implementation. Differential Review: http://reviews.llvm.org/D16829 llvm-svn: 277169
* [Hexagon] Custom lower VECTOR_SHUFFLE and EXTRACT_SUBVECTOR for HVXKrzysztof Parzyszek2016-07-297-22/+631
| | | | | | | | | | | | | | | | If the mask of a vector shuffle has alternating odd or even numbers starting with 1 or 0 respectively up to the largest possible index for the given type in the given HVX mode (single or double) we can generate a vpacko or vpacke instruction respectively. E.g. %42 = shufflevector <32 x i16> %37, <32 x i16> %41, <32 x i32> <i32 1, i32 3, ..., i32 63> is %42.h = vpacko(%41.w, %37.w) Patch by Pranav Bhandarkar. llvm-svn: 277168
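The mask condition in this message can be sketched as a simple stride check. This is not the Hexagon lowering code; the function names are hypothetical, and the sketch only checks the index pattern, not the vector type or HVX mode.

```cpp
#include <cstddef>
#include <vector>

// Returns true when Mask is Start, Start+2, Start+4, ... for every element.
// For two N-element inputs and an N-element mask, this ends at 2N-2 (even)
// or 2N-1 (odd), the largest possible index.
bool isStrideTwoMask(const std::vector<int> &Mask, int Start) {
  if (Mask.empty())
    return false;
  for (std::size_t I = 0; I < Mask.size(); ++I)
    if (Mask[I] != Start + 2 * static_cast<int>(I))
      return false;
  return true;
}

// Hypothetical helpers: even indices map to vpacke, odd indices to vpacko.
bool looksLikeVpacke(const std::vector<int> &Mask) {
  return isStrideTwoMask(Mask, 0);
}

bool looksLikeVpacko(const std::vector<int> &Mask) {
  return isStrideTwoMask(Mask, 1);
}
```

For the shuffle in the message, the 32-element mask 1, 3, ..., 63 satisfies the odd check and so corresponds to the vpacko form.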
* Initial support for vectorization using svml (short vector math library).Matt Masten2016-07-293-3/+259
| | | | | | Differential Revision: https://reviews.llvm.org/D19544 llvm-svn: 277166
* [GlobalISel] Auto-brief LowLevelType. NFC.Ahmed Bougacha2016-07-291-29/+25
| | | | llvm-svn: 277163
* [GlobalISel] Add LLT::operator!=().Ahmed Bougacha2016-07-292-1/+16
| | | | llvm-svn: 277162
* [GlobalISel] Fix LLT::unsized to match LLT(LabelTy).Ahmed Bougacha2016-07-292-1/+6
| | | | | | | | | When coming from an IR label type, we set a 0 NumElements, but not when constructing an LLT using unsized(), causing comparisons to fail. Pick one variant and fix the other. llvm-svn: 277161
* [GlobalISel] Add unittests for LowLevelType.Ahmed Bougacha2016-07-293-0/+200
| | | | llvm-svn: 277160
* Reinstate optnone test for GVN Hoisting, removed in r276479.Paul Robinson2016-07-291-1/+2
| | | | llvm-svn: 277158
* Remove inline-comment-2.ll until I can debug why it fails on some buildsNirav Dave2016-07-291-23/+0
| | | | llvm-svn: 277152
* [Hexagon] Improve balancing of address calculationKrzysztof Parzyszek2016-07-292-3/+792
| | | | | | | | | Rebalances address calculation trees and applies Hexagon-specific optimizations to the trees to improve instruction selection. Patch by Tobias Edler von Koch. llvm-svn: 277151
* Fix inline-comment-2.ll tripleNirav Dave2016-07-291-2/+3
| | | | llvm-svn: 277149
* Avoid unnecessary 32-bit to 64-bit zero extensions followingDavid L Kreitzer2016-07-292-8/+7
| | | | | | | | | 32-bit CMOV instructions on x86_64. The 32-bit CMOV implicitly zero extends. Differential Revision: https://reviews.llvm.org/D22941 llvm-svn: 277148
* [MC] When emitting output hash comments always use standard line comment ↵Nirav Dave2016-07-293-4/+27
| | | | | | separator llvm-svn: 277146
* Fix license information in the file headerKrzysztof Parzyszek2016-07-291-2/+5
| | | | llvm-svn: 277145
* Add missing files to r277143Krzysztof Parzyszek2016-07-292-0/+213
| | | | llvm-svn: 277144
* [Hexagon] Implement DFA based hazard recognizerKrzysztof Parzyszek2016-07-292-3/+11
| | | | | | | | | | | The post register allocator scheduler can generate poor schedules because the scoreboard hazard recognizer is unable to identify hazards for Hexagon precisely. Instead, Hexagon should use a DFA based hazard recognizer. Patch by Brendon Cahoon. llvm-svn: 277143
* Re-commit: [mips][fastisel] Handle 0-4 arguments without SelectionDAG.Daniel Sanders2016-07-2925-52/+208
| | | | | | | | | | | | | | | | | | | | | | Summary: Implements fastLowerArguments() to avoid the need to fall back on SelectionDAG for 0-4 argument functions that don't do tricky things like passing double in a pair of i32s. This allows us to move all except one test to -fast-isel-abort=3. The remaining one has function prototypes of the form 'i32 (i32, double, double)' which requires floats to be passed in GPRs. The previous commit had an uninitialized variable that caused the incoming argument region to have undefined size. This has been fixed. Reviewers: sdardis Subscribers: dsanders, llvm-commits, sdardis Differential Revision: https://reviews.llvm.org/D22680 llvm-svn: 277136
* Cleanup TransferDbgValuesNirav Dave2016-07-291-2/+9
| | | | | | | | | | | | | | | [DAG] Check debug values for invalidation before transferring and mark old debug values invalid when transferring to another SDValue. This fixes PR28613. Reviewers: jyknight, hans, dblaikie, echristo Subscribers: yaron.keren, ismail, llvm-commits Differential Revision: https://reviews.llvm.org/D22858 llvm-svn: 277135
* [X86][SSE] Optimize the truncation of vector comparison results with PACKSSSimon Pilgrim2016-07-293-729/+475
| | | | | | | | | | | | We currently default to using either generic shuffles or MASK+PACKUS/PACKSS to truncate all integer vectors. For vector comparisons, we know that the result will be either all-ones or all-zero bits in every element, which can be efficiently truncated by directly using PACKSS to repeatedly halve the size of each element. Due to the limited input values (-1 or 0) we don't need to account for vector element size, so for simplicity we just use the PACKSS(vXi16,vXi16) implementation in all cases. Additionally for AVX2 PACKSS of 256bit data we must perform a PERMQ shuffle to reorder the data into the correct order. I did investigate performing a single shuffle after all the PACKSS calls but the need to cross 128bit lanes makes this difficult to achieve efficiently. We avoid performing this on AVX512 as it should have better alternative truncation instructions. Differential Revision: https://reviews.llvm.org/D22814 llvm-svn: 277132
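The reason PACKSS needs no masking for comparison results can be shown with a scalar model of one lane. This is a sketch of signed saturation from 16 to 8 bits only, not the X86 lowering code; the function name is hypothetical.

```cpp
#include <cstdint>

// Scalar model of a single lane of a 16-bit to 8-bit signed saturating pack.
// Values outside the int8_t range are clamped; values inside it pass through.
std::int8_t packssLane(std::int16_t V) {
  if (V > INT8_MAX)
    return INT8_MAX;
  if (V < INT8_MIN)
    return INT8_MIN;
  return static_cast<std::int8_t>(V);
}
```

A comparison result is either -1 (all bits set) or 0, and both lie inside the narrower range, so saturation leaves them unchanged. Arbitrary integers do not have this property, which is why the generic path needs a mask before packing.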
* Fixed MSVC out of range shift warningSimon Pilgrim2016-07-291-1/+1
| | | | llvm-svn: 277130
* Fix for commit rL277126 that broke a build.Sjoerd Meijer2016-07-291-1/+1
| | | | llvm-svn: 277129
* [Thumb] Emit Thumb move in both Thumb modes for struct_byval predicatesPrakhar Bahuguna2016-07-292-4/+34
| | | | | | | | | | | | | | | | | Summary: The MOV/MOVT instructions being chosen for struct_byval predicates was conditional only on Thumb2, resulting in an ARM MOV/MOVT instruction being incorrectly emitted in Thumb1 mode. This is especially apparent with v8-m.base targets. This patch ensures that Thumb instructions are emitted in both Thumb modes. Reviewers: rengolin, t.p.northover Subscribers: llvm-commits, aemerson, rengolin Differential Revision: https://reviews.llvm.org/D22865 llvm-svn: 277128
* [lanai] Update for Target API (TargetRegistry::RegisterMCAsmBackend) changeJacques Pienaar2016-07-292-5/+7
| | | | llvm-svn: 277127
* TargetInstrInfo: add virtual function getInstSizeInBytesSjoerd Meijer2016-07-299-8/+14
| | | | | | | | | This adds a target hook getInstSizeInBytes to TargetInstrInfo that a lot of subclasses already implement. Differential Revision: https://reviews.llvm.org/D22885 llvm-svn: 277126
* [AVX512] Mark EVEX VMOVSSrm and VMOVSDrm as canFoldAsLoad and ↵Craig Topper2016-07-294-304/+131
| | | | | | isReMaterializable. llvm-svn: 277120
* [AVX512] Copy the patterns that recognize scalar arithmetic operations ↵Craig Topper2016-07-292-2/+105
| | | | | | inserting into the lower element of a packed vector from AVX/SSE so that we can use EVEX encoded instructions. llvm-svn: 277119
* [AVX512] Add AVX512 run lines to some tests for scalar fma/add/sub/mul/div ↵Craig Topper2016-07-293-140/+337
| | | | | | and regenerate. Follow up commits will bring AVX512 code up to the same quality as AVX/SSE. llvm-svn: 277118
* [EarlyCSE] Correctly handle simplified, but live, instructionsDavid Majnemer2016-07-292-2/+18
| | | | | | | | | Some instructions may have their uses replaced with a symbolic constant. However, the instruction may still have side effects which precludes it from being removed from the function. EarlyCSE treated such an instruction as if it were removed, resulting in PR28763. llvm-svn: 277114
* [ConstantFolding] Fold bitcasts of vectors w/ undef elementsDavid Majnemer2016-07-292-2/+17
| | | | | | | | An undef vector element can be treated as if it had any value. Folding such a vector element to 0 in a bitcast can open up further folding opportunities. llvm-svn: 277104
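A toy model of this fold, assuming a little-endian layout and a 4 x i8 to i32 bitcast: each undef element is simply treated as 0 so the whole vector folds to one integer constant. The types and the function name are invented for this sketch and are not the ConstantFolding code.

```cpp
#include <cstdint>

// Hypothetical representation of one i8 vector element that may be undef.
struct ToyByte {
  bool IsUndef;
  std::uint8_t Value;
};

// Folds a <4 x i8> constant to an i32, choosing 0 for undef elements.
// Element 0 lands in the low byte (little-endian assumption).
std::uint32_t foldBitcastToI32(const ToyByte (&Elts)[4]) {
  std::uint32_t Result = 0;
  for (int I = 0; I < 4; ++I) {
    std::uint32_t Byte = Elts[I].IsUndef ? 0u : Elts[I].Value;
    Result |= Byte << (8 * I);
  }
  return Result;
}
```

Because undef may take any value, picking 0 is a legal choice, and a fully constant result can then feed further folds.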
* [ConstantFolding] Remove an unused ConstantFoldInstOperands overloadDavid Majnemer2016-07-292-27/+5
| | | | | | No functional change is intended. llvm-svn: 277101
* [ConstantFolding] Use ConstantExpr::getWithOperandsDavid Majnemer2016-07-293-2/+5
| | | | | | | | | ConstantExpr::getWithOperands does much of the hard work that ConstantFoldInstOperandsImpl tries to do but more completely. This lets us fold ExtractValue/InsertValue expressions. llvm-svn: 277100
* [ConstantFolding] Teach the folder how to fold ConstantVectorDavid Majnemer2016-07-2913-109/+114
| | | | | | | | | | | A ConstantVector can have ConstantExpr operands and vice versa. However, the folder had no ability to fold ConstantVectors which, in some cases, was an optimization barrier. Instead, rephrase the folder in terms of Constants instead of ConstantExprs and teach callers how to deal with failure. llvm-svn: 277099
* [AVX512] Remove the intrinsic forms of VMOVSS/VMOVSD. We don't need two ↵Craig Topper2016-07-293-31/+45
| | | | | | | | different forms of 'rr' and 'rm'. This matches SSE/AVX. I'm not convinced the patterns for the rm_Int were correct anyway. It had a tied source that shouldn't exist for the unmasked version. The load form of MOVSS always zeros the most significant bits. I've left the patterns off the masked load instructions as I'm not sure what the correct pattern should be and we don't have any tests currently. Nor do we implement masked scalar load intrinsics in clang currently. llvm-svn: 277098
* [CFLAA] Check for pointer types in more places.George Burgess IV2016-07-291-2/+4
| | | | | | | | | | | | | | This patch fixes an assertion that fires when we try to add non-pointer Values to the CFLGraph. Centralizing the check for whether something is/isn't a pointer type isn't completely trivial (and, in some cases, would end up being entirely redundant), but it may be beneficial to do so if this trips us up more in the future. Patch by Jia Chen. Differential Revision: https://reviews.llvm.org/D22947 llvm-svn: 277096
* Add LLVM_ENABLE_LLD option to use LLD as C/C++ linker.Eugene Zelenko2016-07-292-0/+7
| | | | | | Differential revision: https://reviews.llvm.org/D22896 llvm-svn: 277093
* Capture stderr when checking for gold versionTeresa Johnson2016-07-291-3/+5
| | | | | | | On MacOS the ld version is emitted to stderr, resulting in lots of messages in the ninja check output. llvm-svn: 277092