summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [WebAssembly] Only write 32-bits for WebAssembly::OPERAND_OFFSET32Sam Clegg2018-04-041-0/+5
| | | | | | | | | A bug was found where an offset of -1 would generate an encoding of max int64 which is invalid in the binary format. Differential Revision: https://reviews.llvm.org/D45280 llvm-svn: 329238
* AArch64: Implement support for the shadowcallstack attribute.Peter Collingbourne2018-04-045-13/+103
| | | | | | | | | | | | The implementation of shadow call stack on aarch64 is quite different to the implementation on x86_64. Instead of reserving a segment register for the shadow call stack, we reserve the platform register, x18. Any function that spills lr to sp also spills it to the shadow call stack, a pointer to which is stored in x18. Differential Revision: https://reviews.llvm.org/D45239 llvm-svn: 329236
* Don't inline @llvm.icall.branch.funnelVitaly Buka2018-04-041-9/+14
| | | | | | | | | | | | | | Summary: @llvm.icall.branch.funnel is musttail with variable number of arguments. After inlining current backend can't separate call targets from call arguments. Reviewers: pcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45116 llvm-svn: 329235
* [MemorySSA] Fix spelling errors in MemorySSA.cpp. NFCZhaoshi Zheng2018-04-041-2/+2
| | | | llvm-svn: 329230
* hwasan: add -hwasan-match-all-tag flagEvgeniy Stepanov2018-04-041-0/+11
| | | | | | | | | | | | | | | | Sometimes instead of storing addresses as is, the kernel stores the address of a page and an offset within that page, and then computes the actual address when it needs to make an access. Because of this the pointer tag gets lost (gets set to 0xff). The solution is to ignore all accesses tagged with 0xff. This patch adds a -hwasan-match-all-tag flag to hwasan, which allows to ignore accesses through pointers with a particular pointer tag value for validity. Patch by Andrey Konovalov. Differential Revision: https://reviews.llvm.org/D44827 llvm-svn: 329228
* [MachineOutliner] Add `useMachineOutliner` target hookJessica Paquette2018-04-043-1/+15
| | | | | | | | | | | | | The MachineOutliner has a bunch of target hooks that will call llvm_unreachable if the target doesn't implement them. Therefore, if you enable the outliner on such a target, it'll just crash. It'd be much better if it'd just *not* run the outliner at all in this case. This commit adds a hook to TargetInstrInfo that returns false by default. Targets that implement the hook make it return true. The outliner checks the return value of this hook to decide whether or not to continue. llvm-svn: 329220
* [Analysis] Support aligned new/delete functions.Eric Fiselier2018-04-042-0/+45
| | | | | | | | | | | | | | | | Summary: Clang's __builtin_operator_new/delete was recently taught about the aligned allocation overloads (r328134). This patch makes LLVM aware of them as well. This allows the compiler to perform certain optimizations including eliding new/delete calls. Reviewers: rsmith, majnemer, dblaikie, vsk, bkramer Reviewed By: bkramer Subscribers: ckennelly, llvm-commits Differential Revision: https://reviews.llvm.org/D44769 llvm-svn: 329218
* Revert "[Analysis] Support aligned new/delete functions."Eric Fiselier2018-04-042-45/+0
| | | | | | This reverts commit bee3bbd9bdd3ab3364b8fb0cdb6326bc1ae740e0. llvm-svn: 329217
* [AArch64] Change std::sort to llvm::sort in response to r327219Mandeep Singh Grang2018-04-041-4/+4
| | | | | | | | | | | | | | | | | | | | | Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused due to undefined sorting order of objects having the same key. To make use of that infrastructure we need to invoke llvm::sort instead of std::sort. Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort. Refer the comments section in D44363 for a list of all the required patches. Reviewers: t.p.northover, jmolloy, RKSimon, rengolin Reviewed By: rengolin Subscribers: dexonsmith, rengolin, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D44853 llvm-svn: 329216
* [Analysis] Support aligned new/delete functions.Eric Fiselier2018-04-042-0/+45
| | | | | | | | | | | | | | | | Summary: Clang's __builtin_operator_new/delete was recently taught about the aligned allocation overloads (r328134). This patch makes LLVM aware of them as well. This allows the compiler to perform certain optimizations including eliding new/delete calls. Reviewers: rsmith, majnemer, dblaikie, vsk, bkramer Reviewed By: bkramer Subscribers: ckennelly, llvm-commits Differential Revision: https://reviews.llvm.org/D44769 llvm-svn: 329215
* [X86] Separate BSWAP32r and BSWAP64r scheduling data in ↵Craig Topper2018-04-045-5/+40
| | | | | | | | SandyBridge/Haswell/Broadwell/Skylake scheduler models. The BSWAP64r version is 2 uops and BSWAP32r is only 1 uop. The regular expressions also looked for a non-existant BSWAP16r. llvm-svn: 329211
* [llvm-pdbutil] Add the ability to explain binary files.Zachary Turner2018-04-043-14/+20
| | | | | | | | | | Using this, you can use llvm-pdbutil to export the contents of a stream to a binary file, then run explain on the binary file so that it treats the offset as an offset into the stream instead of an offset into a file. This makes it easy to compare the contents of the same stream from two different files. llvm-svn: 329207
* [Power9]Legalize and emit code for quad-precision fma instructionsLei Huang2018-04-042-8/+39
| | | | | | | | | | | | | Legalize and emit code for the following quad-precision fma: * xsmaddqp * xsnmaddqp * xsmsubqp * xsnmsubqp Differential Revision: https://reviews.llvm.org/D44843 llvm-svn: 329206
* Fix build breakage from r329201Pavel Labath2018-04-042-5/+5
| | | | | | | | | Some compilers do not like having an enum type and a variable with the same name (AccelTableKind). I rename the variable to TheAccelTableKind. Suggestions for a better name welcome. llvm-svn: 329202
* Re-commit r329179 after fixing build&test issuesPavel Labath2018-04-044-25/+342
| | | | | | | | | | | - MSVC was not OK with a static_assert referencing a non-static member variable, even though it was just in a sizeof(expression). I move the assert into the emit function, where it is probably more useful. - Tests were failing in builds which did not have the X86 target configured. Since this functionality is not target-specific, I have removed the target specifiers from the .ll files. llvm-svn: 329201
* [AMDGPU][MC] Enabled instruction TBUFFER_LOAD_FORMAT_XYZ for SI/CIDmitry Preobrazhensky2018-04-041-1/+1
| | | | | | | | | See bug 36958: https://bugs.llvm.org/show_bug.cgi?id=36958 Differential Revision: https://reviews.llvm.org/D45099 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 329197
* Revert r329179 (and follow-up unsuccessful fix attempts 329184, 329186); it ↵Nico Weber2018-04-044-342/+25
| | | | | | doesn't build. llvm-svn: 329190
* [AMDGPU][MC] Added support of 3-element addresses for MIMG instructionsDmitry Preobrazhensky2018-04-041-1/+9
| | | | | | | | | See bug 35999: https://bugs.llvm.org/show_bug.cgi?id=35999 Differential Revision: https://reviews.llvm.org/D45084 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 329187
* Attempt to fix bots more after r329179.Nico Weber2018-04-041-1/+1
| | | | llvm-svn: 329186
* Attempt to fix bots after r329179.Nico Weber2018-04-041-2/+2
| | | | llvm-svn: 329184
* Sort targetgen calls in lib/Target/*/CMakeLists.Nico Weber2018-04-0418-84/+92
| | | | | | | | | | | Makes it easier to see mistakes such as the one fixed in r329178 and makes the different target CMakeLists more consistent. Also remove some stale-looking comments from the Nios2 target cmakefile. No intended behavior change. llvm-svn: 329181
* [CodeGen] Generate DWARF v5 Accelerator TablesPavel Labath2018-04-044-25/+342
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch adds a DwarfAccelTableEmitter class, which generates an accelerator table, as specified in DWARF v5 standard. At the moment it only generates a DIE offset column and (if we are indexing more than one compile unit) a CU column. Indexing type units is not currently supported, as we don't even have the ability to generate DWARF v5-compatible compile units. The implementation is not data-source agnostic like the one generating apple tables. This was not necessary as we currently only have one user of this code, and without a second user it was not obvious to me how to best abstract this. (The difference between these tables and the apple ones is that they need a lot more metadata about the debug info they are indexing). The generation is triggered by the --accel-tables argument, which supersedes the --dwarf-accel-tables arg -- the latter was a simple on-off switch, but not we can choose between two kinds of accelerator tables we can generate. This is tested by parsing the generated tables with llvm-dwarfdump and the DWARFVerifier, and I've also checked that GNU readelf is able to make sense of the tables. Differential Revision: https://reviews.llvm.org/D43286 llvm-svn: 329179
* Remove duplicate tablegen lines from AVR target.Nico Weber2018-04-041-6/+2
| | | | | | | | | | They were added in r285274, in what looks like a merge mishap. AVRGenMCCodeEmitter.inc is the only non-dupe tablegen invocation added in that revision. Also sort the tablegen lines to make this easier to spot in the future. llvm-svn: 329178
* Make helpers static. NFC.Benjamin Kramer2018-04-044-5/+10
| | | | llvm-svn: 329170
* AMDGPU: Dimension-aware image intrinsicsNicolai Haehnle2018-04-046-3/+271
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: These new image intrinsics contain the texture type as part of their name and have each component of the address/coordinate as individual parameters. This is a preparatory step for implementing the A16 feature, where coordinates are passed as half-floats or -ints, but the Z compare value and texel offsets are still full dwords, making it difficult or impossible to distinguish between A16 on or off in the old-style intrinsics. Additionally, these intrinsics pass the 'texfailpolicy' and 'cachectrl' as i32 bit fields to reduce operand clutter and allow for future extensibility. v2: - gather4 supports 2darray images - fix a bug with 1D images on SI Change-Id: I099f309e0a394082a5901ea196c3967afb867f04 Reviewers: arsenm, rampitec, b-sumner Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D44939 llvm-svn: 329166
* StructurizeCFG: Test for branch divergence correctlyNicolai Haehnle2018-04-041-13/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes cases like the new test @nonuniform. In that test, %cc itself is a uniform value; however, when reading it after the end of the loop in basic block %if, its value is effectively non-uniform, so the branch is non-uniform. This problem was encountered in https://bugs.freedesktop.org/show_bug.cgi?id=103743; however, this change in itself is not sufficient to fix that bug, as there is another issue in the AMDGPU backend. As discovered after committing an earlier version of this change, this exposes a subtle interaction between this pass and DivergenceAnalysis: since we remove and re-create branch instructions, we can no longer rely on DivergenceAnalysis for branches in subregions that were already processed by the pass. Explicitly remove branch instructions from DivergenceAnalysis to avoid dangling pointers as a matter of defensive programming, and change how we detect non-uniform subregions. Change-Id: I32bbffece4a32f686fab54964dae1a5dd72949d4 Differential Revision: https://reviews.llvm.org/D43743 llvm-svn: 329165
* AMDGPU: Fix copying i1 value out of loop with non-uniform exitNicolai Haehnle2018-04-044-1/+103
| | | | | | | | | | | | | | | | | | | | | | | | Summary: When an i1-value is defined inside of a loop and used outside of it, we cannot simply use the SGPR bitmask from the loop's last iteration. There are also useful and correct cases of an i1-value being copied between basic blocks, e.g. when a condition is computed outside of a loop and used inside it. The concept of dominators is not sufficient to capture what is going on, so I propose the notion of "lane-dominators". Fixes a bug encountered in Nier: Automata. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103743 Change-Id: If37b969ddc71d823ab3004aeafb9ea050e45bd9a Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D40547 llvm-svn: 329164
* [AArch64] Add patterns matching (fabs (fsub x y)) to (fabd x y)John Brawn2018-04-041-0/+13
| | | | | | Differential Revision: https://reviews.llvm.org/D44573 llvm-svn: 329163
* [DAGCombine] Improve ReduceLoadWidth for SRLSam Parker2018-04-041-0/+26
| | | | | | | | | | | | | | | | | | Recommitting rL321259. Previosuly this caused an issue with PPCBE but I didn't receieve a reproducer and didn't have the time to follow up. If the issue appears again, please provide a reproducer so I can fix it. Original commit message: If the SRL node is only used by an AND, we may be able to set the ExtVT to the width of the mask, making the AND redundant. To support this, another check has been added in isLegalNarrowLoad which queries whether the load is valid. Differential Revision: https://reviews.llvm.org/D41350 llvm-svn: 329160
* [ARM] Do not convert some vmov instructionsMikhail Maltsev2018-04-041-8/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Patch https://reviews.llvm.org/D44467 implements conversion of invalid vmov instructions into valid ones. It turned out that some valid instructions also get converted, for example vmov.i64 d2, #0xff00ff00ff00ff00 -> vmov.i16 d2, #0xff00 Such behavior is incorrect because according to the ARM ARM section F2.7.7 Modified immediate constants in T32 and A32 Advanced SIMD instructions, "On assembly, the data type must be matched in the table if possible." This patch fixes the isNEONmovReplicate check so that the above instruction is not modified any more. Reviewers: rengolin, olista01 Reviewed By: rengolin Subscribers: javed.absar, kristof.beyls, rogfer01, llvm-commits Differential Revision: https://reviews.llvm.org/D44678 llvm-svn: 329158
* [X86] Use the same predicate for the load for PMOVSXBQ and PMOVZXBQ.Craig Topper2018-04-043-18/+10
| | | | | | These both use a 16-bit load, but one used loadi16_anyext and the other used extloadi32i16. The only difference between them is that loadi16_anyext checked that the load was at least 2 byte aligned and non-volatile. But the alignment doesn't matter here. Just use extloadi32i16 for both. llvm-svn: 329154
* [X86] Use loadi16/loadi32 predicates in multiply patternsCraig Topper2018-04-041-9/+9
| | | | llvm-svn: 329153
* [X86] Remove more dead code left over from the handling of i8/i16 ↵Craig Topper2018-04-041-21/+0
| | | | | | UMUL_LOHI/SMUL_LOHI that is no longer needed. NFC llvm-svn: 329152
* [SCEV] Prove implications for SCEVUnknown PhisMax Kazantsev2018-04-041-0/+118
| | | | | | | | | | | | | | | | | This patch teaches SCEV how to prove implications for SCEVUnknown nodes that are Phis. If we need to prove `Pred` for `LHS, RHS`, and `LHS` is a Phi with possible incoming values `L1, L2, ..., LN`, then if we prove `Pred` for `(L1, RHS), (L2, RHS), ..., (LN, RHS)` then we can also prove it for `(LHS, RHS)`. If both `LHS` and `RHS` are Phis from the same block, it is sufficient to prove the predicate for values that come from the same predecessor block. The typical case that it handles is that we sometimes need to prove that `Phi(Len, Len - 1) >= 0` given that `Len > 0`. The new logic was added to `isImpliedViaOperations` and only uses it and non-recursive reasoning to prove the facts we need, so it should not hurt compile time a lot. Differential Revision: https://reviews.llvm.org/D44001 Reviewed By: anna llvm-svn: 329150
* [X86] Remove dead code for handling i8/i16 UMUL_LOHI/SMUL_LOHI from ↵Craig Topper2018-04-041-12/+0
| | | | | | | | X86ISelDAGToDAG.cpp. NFC These are promoted to i16/i32 multiplies by a DAG combine. llvm-svn: 329147
* [X86] Remove some code that was only needed when i1 was a legal type. NFCCraig Topper2018-04-041-6/+0
| | | | llvm-svn: 329146
* [SimplifyCFG] Teach merge conditional stores to handle cases where the ↵Craig Topper2018-04-041-1/+16
| | | | | | | | | | | | | | | | | | | PostBB has more than 2 predecessors by inserting a new block for the store. Summary: Currently merge conditional stores can't handle cases where PostBB (the block we need to move the store to) has more than 2 predecessors. This patch removes that restriction by creating a new block with only the 2 predecessors we care about and an unconditional branch to the original block. This provides a place to put the store. Reviewers: efriedma, jmolloy, ABataev Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39760 llvm-svn: 329142
* Fix bad #include path in r329139Vlad Tsyrklevich2018-04-041-1/+1
| | | | llvm-svn: 329140
* Add the ShadowCallStack passVlad Tsyrklevich2018-04-044-0/+334
| | | | | | | | | | | | | | | | | | | | Summary: The ShadowCallStack pass instruments functions marked with the shadowcallstack attribute. The instrumented prolog saves the return address to [gs:offset] where offset is stored and updated in [gs:0]. The instrumented epilog loads/updates the return address from [gs:0] and checks that it matches the return address on the stack before returning. Reviewers: pcc, vitalybuka Reviewed By: pcc Subscribers: cryptoad, eugenis, craig.topper, mgorny, llvm-commits, kcc Differential Revision: https://reviews.llvm.org/D44802 llvm-svn: 329139
* Minor no-op cmake file style fix.Nico Weber2018-04-041-1/+2
| | | | llvm-svn: 329137
* Reapply r329133 with fix.Lang Hames2018-04-041-5/+36
| | | | llvm-svn: 329136
* Revert r329133 "[RuntimeDyld][AArch64] Add some error pluming / generation..."Lang Hames2018-04-041-32/+4
| | | | | | This broke a number of buildbots. Looking in to it now... llvm-svn: 329135
* [MachineOutliner] Test for X86FI->getUsesRedZone() as well as ↵Jessica Paquette2018-04-031-1/+5
| | | | | | | | | | | | | | | | Attribute::NoRedZone This commit is similar to r329120, but uses the existing getUsesRedZone() function in X86MachineFunctionInfo. This teaches the outliner to look at whether or not a function *truly* uses a redzone instead of just the noredzone attribute on a function. Thus, after this commit, it's possible to outline from x86 without using -mno-red-zone and still get outlining results. This also adds a new test for the new redzone behaviour. llvm-svn: 329134
* [RuntimeDyld][AArch64] Add some error pluming / generation to catch unhandledLang Hames2018-04-031-4/+32
| | | | | | relocation types on AArch64. llvm-svn: 329133
* [AMDGPU] performMinMaxCombine should not optimize patterns of vectors to ↵Farhana Aleen2018-04-031-1/+1
| | | | | | | | | | | | | | | | min3/max3. Summary: There are no packed instructions for min3 or max3. So, performMinMaxCombine should not optimize vectors of f16 to min3/max3. Author: FarhanaAleen Reviewed By: arsenm Subscribers: llvm-commits, AMDGPU Differential Revision: https://reviews.llvm.org/D45219 llvm-svn: 329131
* [AArch64] Adjust the cost model for Exynos M3Evandro Menezes2018-04-031-12/+12
| | | | | | Fix typo and simplify matching expression. llvm-svn: 329130
* [Hexagon] peel loops with runtime small trip countsIkhlas Ajbar2018-04-032-4/+2
| | | | | | | | Move the check canPeel() to Hexagon Target before setting PeelCount. Differential Revision: https://reviews.llvm.org/D44880 llvm-svn: 329129
* [InstCombine] allow more fmul folds with 'reassoc'Sanjay Patel2018-04-031-64/+64
| | | | | | | | The tests marked with 'FIXME' require loosening the check in SimplifyAssociativeOrCommutative() to optimize completely; that's still checking isFast() in Instruction::isAssociative(). llvm-svn: 329121
* [MachineOutliner] Keep track of fns that use a redzone in AArch64FunctionInfoJessica Paquette2018-04-033-7/+27
| | | | | | | | | | | | This patch adds a hasRedZone() function to AArch64MachineFunctionInfo. It returns true if the function is known to use a redzone, false if it is known to not use a redzone, and no value otherwise. This removes the requirement to pass -mno-red-zone when outlining for AArch64. https://reviews.llvm.org/D45189 llvm-svn: 329120
* Revert "MSG"Farhana Aleen2018-04-031-1/+1
| | | | | | | | This reverts commit 9a0ce889d1c39c74d69ecad5ce9c875155ae55de. This was committed by mistake. llvm-svn: 329119
OpenPOWER on IntegriCloud