path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* [x86] fix typo in comment; NFC | Sanjay Patel | 2018-01-26 | 1 | -1/+1
| | | | llvm-svn: 323545
* [X86][AVX] LowerBUILD_VECTORAsVariablePermute - add support for VPERMILPV to v4i32/v4f32 | Simon Pilgrim | 2018-01-26 | 1 | -0/+7
| | | | | | | | Extension to D42431, adding support for v4i32/v4f32 as well as v2i64/v2f64 now that D42308 has landed llvm-svn: 323542
* [X86][SSE] Don't coalesce v4i32 extracts | Simon Pilgrim | 2018-01-26 | 1 | -96/+1
| | | | | | | | | | We currently coalesce v4i32 extracts from all 4 elements to 2 v2i64 extracts + shifts/sign-extends. This seems to have been added back in the days when we tended to spill vectors and reload scalars, or ended up with repeated shuffles moving everything down to 0'th index. I don't think either of these are likely these days as we have better EXTRACT_VECTOR_ELT and VECTOR_SHUFFLE handling, and the existing code tends to make it very difficult for various vector and load combines. Differential Revision: https://reviews.llvm.org/D42308 llvm-svn: 323541
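To make the removed pattern concrete, here is a minimal sketch of what "2 v2i64 extracts + shifts" means at the scalar level. It is not the LLVM lowering code: the function name and the little-endian lane layout (lane 0 in the low bits of the first 64-bit half) are assumptions made for illustration.

```cpp
#include <array>
#include <cstdint>

// Models a 128-bit vector as two 64-bit halves and recovers all four 32-bit
// lanes from them. Each half yields two lanes: one by truncation and one by
// a 32-bit right shift followed by truncation.
// Hypothetical helper; assumes lane 0 sits in the low bits of halves[0].
std::array<int32_t, 4> extractLanesViaI64(const std::array<uint64_t, 2> &halves) {
  std::array<int32_t, 4> lanes{};
  for (int h = 0; h < 2; ++h) {
    const uint64_t wide = halves[h];
    // Low lane: keep the bottom 32 bits.
    lanes[2 * h] = static_cast<int32_t>(static_cast<uint32_t>(wide));
    // High lane: shift the top 32 bits down, then truncate.
    lanes[2 * h + 1] = static_cast<int32_t>(static_cast<uint32_t>(wide >> 32));
  }
  return lanes;
}
```

The commit drops this detour in favour of extracting each 32-bit element directly, which the message argues is friendlier to later vector and load combines.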
* [X86][SSE] Drop PMADDWD in lowerMul | Simon Pilgrim | 2018-01-26 | 1 | -7/+0
| | | | | | As mentioned in D42258, we don't need this any more llvm-svn: 323540
* [DAG] Teach findBaseOffset to interpret indexes of indexed memory operations | Nirav Dave | 2018-01-26 | 1 | -8/+35
| | | | | | Indexed outputs are additions / subtractions and can be interpreted as such. llvm-svn: 323539
* [AMDGPU][MC] Added validation of image dst/data size (must match dmask and tfe) | Dmitry Preobrazhensky | 2018-01-26 | 1 | -0/+61
| | | | | | | | | See bug 36000: https://bugs.llvm.org/show_bug.cgi?id=36000 Differential Revision: https://reviews.llvm.org/D42483 Reviewers: vpykhtin, artem.tamazov, arsenm llvm-svn: 323538
* [MIPS] Don't crash on unsized extern types with -mgpopt | Alexander Richardson | 2018-01-26 | 1 | -0/+7
| | | | | | | | | | | | | | Summary: This fixes an assertion when building the FreeBSD MIPS64 kernel. Reviewers: atanasyan, sdardis, emaste Reviewed By: sdardis Subscribers: krytarowski, llvm-commits Differential Revision: https://reviews.llvm.org/D42571 llvm-svn: 323536
* [DAGCombine] reduceBuildVecToShuffle - ensure EXTRACT_VECTOR_ELT index is in range | Simon Pilgrim | 2018-01-26 | 1 | -1/+5
| | | | | | | | From OSS Fuzz Test Case #5688 llvm-svn: 323535
* [AMDGPU][MC] Added support of 64-bit image atomics | Dmitry Preobrazhensky | 2018-01-26 | 5 | -17/+115
| | | | | | | | | See bug 35998: https://bugs.llvm.org/show_bug.cgi?id=35998 Differential Revision: https://reviews.llvm.org/D42469 Reviewers: vpykhtin, artem.tamazov, arsenm llvm-svn: 323534
* [SLP] Removed the warning about unused variable, NFC. | Alexey Bataev | 2018-01-26 | 1 | -1/+1
| | | | llvm-svn: 323533
* [SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle. | Alexey Bataev | 2018-01-26 | 1 | -131/+369
| | | | | | | | | | | | | | | Summary: If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and cost of this gather is counted as cost of InsertElementInstrs for each gathered value. But we can consider these elements as ShuffleInstr with SK_PermuteSingle shuffle kind. Reviewers: spatel, RKSimon, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38697 llvm-svn: 323530
* [AMDGPU][MC] Enabled disassembler for image atomic operations | Dmitry Preobrazhensky | 2018-01-26 | 1 | -12/+16
| | | | | | | | | See bug 35988: https://bugs.llvm.org/show_bug.cgi?id=35988 Differential Revision: https://reviews.llvm.org/D42186 Reviewers: vpykhtin, artem.tamazov, arsenm llvm-svn: 323527
* [X86] Cleanup SDLoc arguments as mentioned on D42544 | Simon Pilgrim | 2018-01-26 | 1 | -6/+7
| | | | llvm-svn: 323526
* [MIR] Add support for addrspace in MIR | Francis Visoiu Mistrih | 2018-01-26 | 4 | -0/+20
| | | | | | | | | | Add support for printing / parsing the addrspace of a MachineMemOperand. Fixes PR35970. Differential Revision: https://reviews.llvm.org/D42502 llvm-svn: 323521
* [AMDGPU] fix LDS f32 intrinsics | Daniil Fukalov | 2018-01-26 | 3 | -22/+25
| | | | | | | | | | | | - using qualified pointer addrspace in intrinsics class to avoid .f32 mangling - changed too common atomic mangling to ds - added missing intrinsics to AMDGPUTTIImpl::getTgtMemIntrinsic Reviewed by: b-sumner Differential Revision: https://reviews.llvm.org/D42383 llvm-svn: 323516
* [CallSiteSplitting] Fix infinite loop when recording conditions. | Florian Hahn | 2018-01-26 | 1 | -1/+2
| | | | | | | | | Fix infinite loop when recording conditions by correctly marking basic blocks as visited. Fixes https://bugs.llvm.org/show_bug.cgi?id=36105 llvm-svn: 323515
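The general shape of this kind of fix can be sketched as follows. This is an illustration only, not the CallSiteSplitting code: the `Block` type and the function name are hypothetical, and the walk over single predecessors is an assumption about how conditions are gathered.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a basic block; only its single predecessor
// (if any) matters for this sketch.
struct Block {
  const Block *singlePred = nullptr;
};

// Walks the single-predecessor chain starting at `from` and returns the
// blocks seen. Every block is recorded in `visited` before the walk moves
// on, so a cycle such as A -> B -> A terminates instead of looping forever.
std::vector<const Block *> walkSinglePredChain(const Block *from) {
  std::vector<const Block *> chain;
  std::unordered_set<const Block *> visited;
  for (const Block *bb = from; bb != nullptr; bb = bb->singlePred) {
    if (!visited.insert(bb).second)
      break; // already seen: the chain has looped back on itself
    chain.push_back(bb);
  }
  return chain;
}
```

Leaving out the `visited.insert` step is exactly what turns a cyclic chain into an infinite loop.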
* [ARM] Accept a subset of Thumb GPR register class when emitting an SP-relative | Momchil Velikov | 2018-01-26 | 1 | -2/+2
| | | | | | | | | | | | | load instruction The function `Thumb1InstrInfo::loadRegFromStackSlot` accepts only the `tGPR` register class. The function serves to emit a `tLDRspi` instruction and certainly any subset of the `tGPR` register class is a valid destination of the load. Differential revision: https://reviews.llvm.org/D42535 llvm-svn: 323514
* [ARM] Armv8.2-A FP16 code generation (part 1/3) | Sjoerd Meijer | 2018-01-26 | 9 | -28/+166
    This is the groundwork for Armv8.2-A FP16 code generation.

    Clang passes and returns _Float16 values as floats, together with the required bitconverts and truncs etc. to implement correct AAPCS behaviour, see D42318. We will implement half-precision argument passing/returning lowering in the ARM backend soon, but for now this means that this:

        _Float16 sub(_Float16 a, _Float16 b) {
          return a + b;
        }

    gets lowered to this:

        define float @sub(float %a.coerce, float %b.coerce) {
        entry:
          %0 = bitcast float %a.coerce to i32
          %tmp.0.extract.trunc = trunc i32 %0 to i16
          %1 = bitcast i16 %tmp.0.extract.trunc to half
          <SNIP>
          %add = fadd half %1, %3
          <SNIP>
        }

    When FullFP16 is *not* supported, we don't make f16 a legal type, and we get legalization for "free", i.e. nothing changes and everything works as before. And also f16 argument passing/returning is handled.

    When FullFP16 is supported, we do make f16 a legal type, and have 2 places that we need to patch up: f16 argument passing and returning, which involves minor tweaks to avoid unnecessary code generation for some bitcasts.

    As a "demonstrator" that this works for the different FP16, FullFP16, softfp modes, etc., I've added match rules to the VSUB instruction description showing that we can codegen this instruction from IR, but more importantly, also to some conversion instructions. These conversions were causing issues before in the FP16 and FullFP16 cases. I've also added match rules to the VLDRH and VSTRH descriptions, so that we can actually compile the entire half-precision sub code example above. This showed that these loads and stores had the wrong addressing mode specified: AddrMode5 instead of AddrMode5FP16, which turned out not to be implemented at all, so that has also been added.

    This is the minimal patch that shows all the different moving parts. In patch 2/3 I will add some efficient lowering of bitcasts, and in 3/3 I will add the remaining Armv8.2-A FP16 instruction descriptions.

    Thanks to Sam Parker and Oliver Stannard for their help and reviews!

    Differential Revision: https://reviews.llvm.org/D38315
    llvm-svn: 323512
* [NFC] fix trivial typos in comments and documents | Hiroshi Inoue | 2018-01-26 | 13 | -14/+14
| | | | | | "in in" -> "in", "on on" -> "on" etc. llvm-svn: 323508
* [RISCV] Encode RISCV specific ELF e_flags to RISCV Binary by RISCVTargetStreamer | Shiva Chen | 2018-01-26 | 6 | -0/+117
| | | | llvm-svn: 323507
* [X86] Remove dead code from LowerBUILD_VECTOR that tried to handle i64 element type in 32-bit mode. | Craig Topper | 2018-01-26 | 1 | -21/+0
| | | | | | Type legalization would prevent any i64 operands to the build_vector from existing before we get here. The coverage bots show this code as uncovered. llvm-svn: 323506
* [SelectionDAG] Replace a std::vector<SDValue> with a SmallVector. | Craig Topper | 2018-01-26 | 1 | -1/+1
| | | | | | It is likely the number of elements in the type we're legalizing here is reasonably small. llvm-svn: 323505
* [X86] Remove code from combineBitcastvxi1 that was needed to support the previous native IR for kunpck intrinsics. | Craig Topper | 2018-01-26 | 1 | -47/+0
| | | | | | The original autoupgrade for kunpck intrinsics used a bitcasted scalar shift, or, and. This combine would turn this into a concat_vectors. Now the kunpck intrinsics are autoupgraded to a vector shuffle that will become a concat_vectors. llvm-svn: 323504
* [X86] Remove unused intrinsic type handling. NFC | Craig Topper | 2018-01-26 | 2 | -28/+2
| | | | llvm-svn: 323503
* [X86] Simplify condition in VSETCC. NFC | Craig Topper | 2018-01-26 | 1 | -2/+1
| | | | | | This listed all legal 128-bit integer types individually, but since we already know we have a legal type and it's an integer, we can just check is128BitVector. llvm-svn: 323502
* [X86] Remove LowerVSETCC code for handling vXi1 setcc with vXi8/vXi16 input type. NFC | Craig Topper | 2018-01-26 | 1 | -6/+3
| | | | | | These kinds of setccs are promoted by a DAG combine before they ever get to legalization. llvm-svn: 323501
* [X86] Remove some dead code from LowerVSETCC. NFC | Craig Topper | 2018-01-26 | 1 | -13/+0
| | | | | | This code was added in r321967, but ultimately I fixed the issue in the legalizer and this code was no longer required. llvm-svn: 323500
* [CGP] Re-enable Select in complex addressing mode. | Serguei Katkov | 2018-01-26 | 1 | -1/+1
| | | | | | Switch Select handling on after fixing two bugs: rL323192 and rL323497. llvm-svn: 323498
* [X86] Fix killed flag handling in X86FixupLea pass | Serguei Katkov | 2018-01-26 | 1 | -1/+2
| | | | | | | | | | | | | | When the pass creates a MOV instruction for the lea (%base,%index,1), %dst => mov %base,%dst; add %index,%dst modification, it should clear the killed flag for base if base is equal to index. Otherwise the verifier complains about usage of a killed register in the add instruction. Reviewers: lsaba, zvi, zansari, aaboud Reviewed By: lsaba Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42522 llvm-svn: 323497
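The rule in this message can be sketched with a toy model. The types `RegUse` and `Inst` and the function `expandLea` are hypothetical and do not correspond to the MachineInstr API; they only mirror the mov/add expansion described above.

```cpp
#include <string>
#include <vector>

// Hypothetical operand model: a register number plus a "last use" marker.
struct RegUse {
  unsigned reg;
  bool isKill;
};

// Hypothetical two-address instruction: dst = op(dst, src).
struct Inst {
  std::string opcode;
  unsigned dst;
  RegUse src;
};

// Expands "lea (%base,%index,1), %dst" into "mov %base,%dst; add %index,%dst".
// If base and index are the same register, the add still reads it after the
// mov, so the mov must not claim to kill it.
std::vector<Inst> expandLea(RegUse base, RegUse index, unsigned dst) {
  if (base.reg == index.reg)
    base.isKill = false;
  return {{"mov", dst, base}, {"add", dst, index}};
}
```

When base and index differ, the kill flag on base is passed through unchanged, because the add no longer reads that register.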
* [CodeGen] Ignore private symbols in llvm.used for COFF | Shoaib Meenai | 2018-01-26 | 1 | -4/+4
| | | | | | | Similar to the existing handling for internal symbols, private symbols are also not visible to the linker and should be ignored. llvm-svn: 323483
* [Debug] LCSSA: Insert dbg.value at the first available insertion point | Vedant Kumar | 2018-01-25 | 1 | -1/+3
| | | | | | | | | | | Inserting a dbg.value instruction at the start of a basic block with a landingpad instruction triggers a verifier failure. We should be OK if we insert the instruction a bit later. Speculative fix for the bot failure described here: https://reviews.llvm.org/D42551 llvm-svn: 323482
* [DWARFv5] Classify all the new forms. NFC. | Paul Robinson | 2018-01-25 | 1 | -14/+25
| | | | | | | | | | Move standard forms from a switch statement to the table of forms; fill in all the missing ones defined in DWARF v5. I'm guessing at classifications in a couple of cases where v5 forms aren't actually supported yet, but whoever adds support for the forms can fix the classifications as needed. llvm-svn: 323481
* [DWARFv5] Support DW_FORM_line_strp in llvm-dwarfdump. | Paul Robinson | 2018-01-25 | 4 | -13/+33
| | | | | | | | | | | This form is like DW_FORM_strp, but points to .debug_line_str instead of .debug_str as the string section. It's intended to be used from the line-table header, and allows string-pooling of directory and filenames across compilation units. Differential Revision: https://reviews.llvm.org/D42553 llvm-svn: 323476
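The difference between the two forms comes down to which string section the offset indexes. The sketch below is not the llvm-dwarfdump implementation: `StringSections` and `resolveStringForm` are hypothetical names, and the sections are modelled as plain byte strings. The form codes are the values assigned by the DWARF specification.

```cpp
#include <cstdint>
#include <string>

// Form codes as assigned by the DWARF specification.
constexpr uint16_t DW_FORM_strp = 0x0e;
constexpr uint16_t DW_FORM_line_strp = 0x1f;

// Hypothetical container for the raw bytes of the two string sections.
struct StringSections {
  std::string debugStr;     // contents of .debug_str
  std::string debugLineStr; // contents of .debug_line_str
};

// Returns the NUL-terminated string at `offset` in the section selected by
// `form`. Returns an empty string for other forms or out-of-range offsets.
std::string resolveStringForm(uint16_t form, uint64_t offset,
                              const StringSections &sections) {
  const std::string *section = nullptr;
  if (form == DW_FORM_strp)
    section = &sections.debugStr;
  else if (form == DW_FORM_line_strp)
    section = &sections.debugLineStr;
  if (section == nullptr || offset >= section->size())
    return {};
  // c_str() guarantees a terminator, so this reads up to the next NUL.
  return std::string(section->c_str() + offset);
}
```

Because every line-table header resolves its directory and file names against the same .debug_line_str section, identical names can be stored once and shared across compilation units.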
* [SyntheticCounts] Rewrite the code using only graph traits. | Easwaran Raman | 2018-01-25 | 2 | -80/+82
| | | | | | | | | | | | | | | | | | | Summary: The intent of this is to allow the code to be used with ThinLTO. In Thinlink phase, a traditional Callgraph can not be computed even though all the necessary information (nodes and edges of a call graph) is available. This is due to the fact that CallGraph class is closely tied to the IR. This patch first extends GraphTraits to add a CallGraphTraits graph. This is then used to implement a version of counts propagation on a generic callgraph. Reviewers: davidxl Subscribers: mehdi_amini, tejohnson, llvm-commits Differential Revision: https://reviews.llvm.org/D42311 llvm-svn: 323475
* [AArch64] Enable aggressive FMA on T99 and provide AArch64 options for others. | Joel Jones | 2018-01-25 | 4 | -0/+16
| | | | | | | | | | | | | | This patch enables aggressive FMA by default on T99, and provides a -mllvm option to enable the same on other AArch64 micro-arch's (-mllvm -aarch64-enable-aggressive-fma). Test case demonstrating the effects on T99 is included. Patch by: steleman (Stefan Teleman) Differential Revision: https://reviews.llvm.org/D40696 llvm-svn: 323474
* [Debug] Add dbg.value intrinsics for PHIs created during LCSSA. | Vedant Kumar | 2018-01-25 | 1 | -2/+7
    This patch is an enhancement to propagate dbg.value information when Phis are created on behalf of LCSSA. I noticed a case where a value carried across a loop was reported as <optimized out>.

    Specifically this case:

        int bar(int x, int y) {
          return x + y;
        }

        int foo(int size) {
          int val = 0;
          for (int i = 0; i < size; ++i) {
            val = bar(val, i); // Both val and i are correct
          }
          return val; // <optimized out>
        }

    In the above case, after all of the interesting computation completes our value is reported as "optimized out." This change will add a dbg.value to correct this.

    This patch also moves the dbg.value insertion routine from LoopRotation.cpp into Local.cpp, so that we can share it in both places (LoopRotation and LCSSA).

    Patch by Matt Davis!

    Differential Revision: https://reviews.llvm.org/D42551
    llvm-svn: 323472
* [Debug] Add a utility to propagate dbg.value to new PHIs, NFC | Vedant Kumar | 2018-01-25 | 2 | -33/+39
| | | | | | | | | | This simply moves an existing utility to Utils for reuse. Split out of: https://reviews.llvm.org/D42551 Patch by Matt Davis! llvm-svn: 323471
* [asan] Fix kernel callback naming in instrumentation module. | Evgeniy Stepanov | 2018-01-25 | 1 | -3/+1
| | | | | | | | | | Right now clang uses "_n" suffix for some user space callbacks and "N" for the matching kernel ones. There's no need for this and it actually breaks kernel build with inline instrumentation. Use the same callback names for user space and the kernel (and also make them consistent with the names GCC uses). Patch by Andrey Konovalov. Differential Revision: https://reviews.llvm.org/D42423 llvm-svn: 323470
* [X86] Teach Intel syntax InstPrinter to print lock prefixes that have been parsed from the asm parser. | Craig Topper | 2018-01-25 | 1 | -2/+2
| | | | | | The asm parser puts the lock prefix in the MCInst flags so we need to check that in addition to TSFlags. This matches what the ATT printer does. llvm-svn: 323469
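As a rough illustration of checking both flag sources, consider the sketch below. The bit positions and the function name are invented for this example and are not the real X86 encodings; only the idea of testing the per-instruction flags alongside the static description flags comes from the commit message.

```cpp
#include <cstdint>
#include <string>

// Hypothetical bit assignments, chosen only for this sketch.
constexpr uint64_t kTSFlagLock = 1u << 0;   // static flag from the instruction description
constexpr unsigned kInstFlagLock = 1u << 0; // per-instruction flag set by the asm parser

// Prints the mnemonic with a "lock " prefix if either source requests it.
std::string printWithPrefix(const std::string &mnemonic, uint64_t tsFlags,
                            unsigned instFlags) {
  const bool hasLock =
      (tsFlags & kTSFlagLock) != 0 || (instFlags & kInstFlagLock) != 0;
  return (hasLock ? "lock " : "") + mnemonic;
}
```

A printer that tests only `tsFlags` would silently drop a prefix that the parser recorded on the instruction itself.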
* [X86] Combine two unnecessarily complicated ifs that had the same body. NFC | Craig Topper | 2018-01-25 | 1 | -3/+1
| | | | llvm-svn: 323468
* Re-land "[ThinLTO] Add call edges' relative block frequency to per-module summary." | Easwaran Raman | 2018-01-25 | 4 | -21/+59
| | | | | | | | | | | | It was reverted after buildbot regressions. Original commit message: This allows relative block frequency of call edges to be passed to the thinlink stage where it will be used to compute synthetic entry counts of functions. llvm-svn: 323460
* [Hexagon] SETEQ and SETNE are valid integer condition codes | Krzysztof Parzyszek | 2018-01-25 | 1 | -1/+2
| | | | llvm-svn: 323452
* Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle." | Alexey Bataev | 2018-01-25 | 1 | -350/+129
| | | | | | This reverts commit r323441 to fix buildbots. llvm-svn: 323447
* [LTO] - Introduce GlobalResolution::Prevailing flag. | George Rimar | 2018-01-25 | 1 | -15/+9
| | | | | | | It is an NFC refactoring change that will make D42107 a bit smaller. Differential revision: https://reviews.llvm.org/D42528 llvm-svn: 323444
* [SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle. | Alexey Bataev | 2018-01-25 | 1 | -129/+350
| | | | | | | | | | | | | | | Summary: If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and cost of this gather is counted as cost of InsertElementInstrs for each gathered value. But we can consider these elements as ShuffleInstr with SK_PermuteSingle shuffle kind. Reviewers: spatel, RKSimon, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38697 llvm-svn: 323441
* [X86] Apply clang-format to detectUSatPattern. NFCI. | Simon Pilgrim | 2018-01-25 | 1 | -5/+4
| | | | | | Cleanup from D42544 llvm-svn: 323439
* Revert "[Hexagon] Replace EmitFunctionEntryCode with a DAG preprocessing code" | Krzysztof Parzyszek | 2018-01-25 | 2 | -22/+16
| | | | | | This reverts r323374. The fix needs a different approach. llvm-svn: 323438
* [InstCombine] narrow masked zexted binops (PR35792) | Sanjay Patel | 2018-01-25 | 2 | -0/+71
| | | | | | | | | | | | | | | | | This is guarded by shouldChangeType(), so the tests show that we don't do the fold if the narrower type is not legal. Note that there is a proposal (D42424) that would change the results for the specific cases shown in these tests. That difference is also discussed in PR35792: https://bugs.llvm.org/show_bug.cgi?id=35792 Alive proofs for the cases handled here as well as the bitwise logic binops that we should already do better on: https://rise4fun.com/Alive/c97 https://rise4fun.com/Alive/Lc5E https://rise4fun.com/Alive/kdf llvm-svn: 323437
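The arithmetic fact behind narrowing a masked, zero-extended binop can be checked exhaustively for a small type. The sketch below uses an add on an 8-bit value widened to 32 bits; the choice of add, the widths, and all function names are assumptions for illustration and are not taken from the patch.

```cpp
#include <cstdint>

// Wide form: and (add (zext X), C), Mask, computed in 32 bits.
uint32_t wideForm(uint8_t x, uint32_t c, uint32_t mask) {
  return (static_cast<uint32_t>(x) + c) & mask;
}

// Narrow form: zext (and (add X, trunc C), trunc Mask), computed in 8 bits.
uint32_t narrowForm(uint8_t x, uint32_t c, uint32_t mask) {
  const uint8_t sum = static_cast<uint8_t>(x + static_cast<uint8_t>(c));
  const uint8_t masked = static_cast<uint8_t>(sum & static_cast<uint8_t>(mask));
  return static_cast<uint32_t>(masked);
}

// Returns true if both forms agree for every 8-bit input.
bool formsAgree(uint32_t c, uint32_t mask) {
  for (unsigned x = 0; x <= 0xFF; ++x) {
    if (wideForm(static_cast<uint8_t>(x), c, mask) !=
        narrowForm(static_cast<uint8_t>(x), c, mask))
      return false;
  }
  return true;
}
```

The two forms agree whenever the mask fits in the narrow type, because the low 8 bits of a sum depend only on the low 8 bits of its operands. A mask with bits above bit 7 set can observe the carry out of the narrow add, so the forms can diverge.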
* Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle." | Alexey Bataev | 2018-01-25 | 1 | -352/+130
| | | | | | This reverts commit r323430 to fix buildbots. llvm-svn: 323432
* [SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle. | Alexey Bataev | 2018-01-25 | 1 | -130/+352
| | | | | | | | | | | | | | | Summary: If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and cost of this gather is counted as cost of InsertElementInstrs for each gathered value. But we can consider these elements as ShuffleInstr with SK_PermuteSingle shuffle kind. Reviewers: spatel, RKSimon, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38697 llvm-svn: 323430