summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [Hexagon] Add support for "new" circular buffer intrinsicsKrzysztof Parzyszek2018-03-285-91/+280
| | | | | | | | | | | | | | | | | | | These instructions have been around for a long time, but we haven't supported intrinsics for them. The "new" versions use the CSx register for the start of the buffer instead of the K field in the Mx register. We need to use pseudo instructions for these instructions until after register allocation. The problem is that these instructions allocate a M0/CS0 or M1/CS1 pair. But, we can't generate code for the CSx set-up until after register allocation when the Mx register has been fixed for the instruction. There is a related clang patch. Patch by Brendon Cahoon. llvm-svn: 328724
* Oops - moved slightly too many things from Scalar to Utils. Move ↵David Blaikie2018-03-282-4/+4
| | | | | | LoopSimplifyCFG things back llvm-svn: 328720
* [MachineOutliner] Simplify call outlining + require valid callee save info ↵Jessica Paquette2018-03-281-31/+18
| | | | | | | | | | for call outlining This commit simplifies the call outlining logic by removing references to the Function associated with the callee. To do this, it requires that valid callee save info is available to the outliner. llvm-svn: 328719
* Transforms: Introduce Transforms/Utils.h rather than spreading the ↵David Blaikie2018-03-2837-29/+50
| | | | | | | | | declarations amongst Scalar.h and IPO.h Fixes layering - Transforms/Utils shouldn't depend on including a Scalar or IPO header, because Scalar and IPO depend on Utils. llvm-svn: 328717
* [llvm-ar] Support multiple dashed optionsPeter Collingbourne2018-03-281-9/+13
| | | | | | | | | | | | | | | This allows syntax like: $ llvm-ar -c -r -u file.a file.o This is in addition to the other formats that are already supported: $ llvm-ar cru file.a file.o $ llvm-ar -cru file.a file.o Patch by Tom Anderson! Differential Revision: https://reviews.llvm.org/D44452 llvm-svn: 328716
* [AMDGPU][MC] Added ds_add_src2_f32Dmitry Preobrazhensky2018-03-281-0/+3
| | | | | | | | | See bug 36833: https://bugs.llvm.org/show_bug.cgi?id=36833 Differential Revision: https://reviews.llvm.org/D44779 Reviewers: arsenm, artem.tamazov, timcorringham llvm-svn: 328713
* [AMDGPU][MC] Added PCK variants of image load/store instructionsDmitry Preobrazhensky2018-03-281-26/+40
| | | | | | | | | See bug 36834: https://bugs.llvm.org/show_bug.cgi?id=36834 Differential Revision: https://reviews.llvm.org/D44795 Reviewers: artem.tamazov, arsenm, timcorringham, nhaehnle llvm-svn: 328710
* [AMDGPU][MC][GFX9] Added buffer_*_format_d16_hi_xDmitry Preobrazhensky2018-03-281-0/+10
| | | | | | | | | See bug 36835: https://bugs.llvm.org/show_bug.cgi?id=36835 Differential Revision: https://reviews.llvm.org/D44825 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 328707
* [AMDGPU][MC][GFX9] Added s_scratch* instructionsDmitry Preobrazhensky2018-03-281-0/+15
| | | | | | | | | See bug 36836: https://bugs.llvm.org/show_bug.cgi?id=36836 Differential Revision: https://reviews.llvm.org/D44832 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 328704
* [X86][Btver2] Moved JWriteFCmp/JWriteFCmpY classes next to each other. NFCISimon Pilgrim2018-03-281-14/+14
| | | | | | Renamed JWriteFPAY22 to JWriteFCmpY - we've tended to avoid latency based names llvm-svn: 328701
* Revert "Reapply "[DWARFv5] Emit file 0 to the line table.""Alexander Potapenko2018-03-287-124/+71
| | | | | | | | | | | | | | | This reverts commit r328676. Commit r328676 broke the -no-integrated-as flag necessary to build Linux kernel with Clang: $ cat t.c void foo() {} $ clang -no-integrated-as -c t.c -g /tmp/t-dcdec5.s: Assembler messages: /tmp/t-dcdec5.s:8: Error: file number less than one clang-7.0: error: assembler command failed with exit code 1 (use -v to see invocation) llvm-svn: 328699
* [X86][BtVer2] Fix the number of micro opcodes for AES[ENC|DEC] and other YMM ↵Andrea Di Biagio2018-03-281-1/+4
| | | | | | | | | | | instructions. Similar to r328694. The number of micro opcodes should be 2 for those instructions. This was found when testing AVX code for BtVer2 using llvm-mca. llvm-svn: 328698
* [MSan] Introduce ActualFnStart. NFCAlexander Potapenko2018-03-281-8/+10
| | | | | | | | | | | | | This is a step towards the upcoming KMSAN implementation patch. KMSAN is going to prepend a special basic block containing tool-specific calls to each function. Because we still want to instrument the original entry block, we'll need to store it in ActualFnStart. For MSan this will still be F.getEntryBlock(), whereas for KMSAN it'll contain the second BB. llvm-svn: 328697
* Revert "[AMDGPU] For OS type AMDPAL, fixed scratch on compute shader"Tim Renouf2018-03-281-4/+2
| | | | | | | | | | | This reverts commit 0daf86291d3aa04d3cc280cd0ef24abdb0174981. It was causing an assert in test/CodeGen/AMDGPU/amdpal.ll only on a release-with-asserts build. I will resubmit the change when I have fixed that. Change-Id: If270594eba27a7dc4076bdeab3fa8e6bfda3288a llvm-svn: 328695
* [X86][BtVer2] Fix the number of micro opcodes for a bunch of YMM instructions.Andrea Di Biagio2018-03-281-0/+12
| | | | | | | | | | | | | The Jaguar backend natively supports 128-bit data types. Operations on YMM registers are split into two COPs (complex operations). Each COP consumes a slot in the dispatch group, and in the reorder buffer. The scheduling model for Jaguar should mark those instructions as `let NumMicroOps = 2`. This was found when testing AVX code for BtVer2 using llvm-mca. llvm-svn: 328694
* [MSan] Add an isStore argument to getShadowOriginPtr(). NFCAlexander Potapenko2018-03-281-38/+47
| | | | | | | | | | | | | | | | This is a step towards the upcoming KMSAN implementation patch. The isStore argument is to be used by getShadowOriginPtrKernel(), it is ignored by getShadowOriginPtrUserspace(). Depending on whether a memory access is a load or a store, KMSAN instruments it with different functions, __msan_metadata_ptr_for_load_X() and __msan_metadata_ptr_for_store_X(). Those functions may return different values for a single address, which is necessary in the case the runtime library decides to ignore particular accesses. llvm-svn: 328692
* [ARM] Support float literals under XOChristof Douma2018-03-283-3/+6
| | | | | | | | | | Follow up patch of r328313 to support the UseVMOVSR constraint. Removed some unneeded instructions from the test and removed some stray comments. Differential Revision: https://reviews.llvm.org/D44941 llvm-svn: 328691
* [RegisterCoalescing] Don't move COPY if it would interfere with another valueMikael Holmen2018-03-281-2/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: RegisterCoalescer::removePartialRedundancy tries to hoist B = A from BB0/BB2 to BB1: BB1: ... BB0/BB2: ---- B = A; | ... | A = B; | |------- | It does so if a number of conditions are fulfilled. However, it failed to check if B was used by any of the terminators in BB1. Since we must insert B = A before the terminators (since it's not a terminator itself), this means that we could erroneously insert a new definition of B before a use of it. Reviewers: wmi, qcolombet Reviewed By: wmi Subscribers: MatzeB, llvm-commits, sdardis Differential Revision: https://reviews.llvm.org/D44918 llvm-svn: 328689
* [ORC] Fix ORC on platforms without indirection support.Lang Hames2018-03-281-1/+5
| | | | | | | | | | Previously this crashed because a nullptr (returned by createLocalIndirectStubsManagerBuilder() on platforms without indirection support) functor was unconditionally invoked. Patch by Andres Freund. Thanks Andres! llvm-svn: 328687
* AMDGPU: Really implement getFrameRegisterMatt Arsenault2018-03-271-1/+2
| | | | | | | Currently this seems to only really be used for debug info. llvm-svn: 328677
* Reapply "[DWARFv5] Emit file 0 to the line table."Paul Robinson2018-03-277-71/+124
| | | | | | | | | | | | | | | | DWARF v5 specifies that the root file (also given in the DW_AT_name attribute of the compilation unit DIE) should be emitted explicitly to the line table's list of files. This makes the line table more independent of the .debug_info section. Fixes the bug found by asan. Also XFAIL the new test for Darwin, which is stuck on DWARF v2, and fix up other tests so they stop failing on Windows. Last but not least, don't break "clang -g" of an assembler file that has .file directives in it. Differential Revision: https://reviews.llvm.org/D44054 llvm-svn: 328676
* [MachineOutliner] AArch64: Don't outline ADRPs with un-outlinable operandsJessica Paquette2018-03-271-11/+7
| | | | | | | | | If an ADRP appears with, say, a CPI operand, we shouldn't outline it. This moves the check for unsafe operands so that it occurs before the special-case for ADRPs. Also add a test for outlining ADRPs. llvm-svn: 328674
* [AMDGPU] For OS type AMDPAL, fixed scratch on compute shaderTim Renouf2018-03-271-2/+4
| | | | | | | | | | | | | | | | | | Summary: For OS type AMDPAL, the scratch descriptor is loaded from offset 0 of the GIT, whose 32 bit pointer is in s0 (s8 for gfx9 merged shaders). This commit fixes that to use offset 0x10 instead of offset 0 for a compute shader, per the PAL ABI spec. Reviewers: kzhuravl, nhaehnle, timcorringham Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits, dstuttard, nhaehnle, arsenm Differential Revision: https://reviews.llvm.org/D44468 Change-Id: I93dffa647758e37f613bb5e0dfca840d82e6d26f llvm-svn: 328673
* [DWARF] Suppress split line tables more carefully.Paul Robinson2018-03-275-20/+19
| | | | | | | | | | | | | | | | If a given split type unit does not have source locations, don't have it refer to the split line table. If no split type unit refers to the split line table, don't emit the line table at all. This will save a little space on rare occasions, but also refactors things a bit to improve which class is responsible for what. Responding to review comments on r326395. Differential Revision: https://reviews.llvm.org/D44220 llvm-svn: 328670
* [CodeGen] Fixed unreachable with -print-machineinstrs and custom pseudo ↵Tim Renouf2018-03-271-1/+6
| | | | | | | | | | | | | | | | | | source value Summary: Rev 327580 "[CodeGen] Use MIR syntax for MachineMemOperand printing" broke -print-machineinstrs for us on AMDGPU, because we have custom pseudo source values, and MIR serialization does not implement that. This commit at least restores the functionality of -print-machineinstrs, even if it does not properly implement the missing MIR serialization functionality. Differential Revision: https://reviews.llvm.org/D44871 Change-Id: I44961c0b90bf6d48c01484ed7a4e466fd300db66 llvm-svn: 328668
* Initialize variable added in r328617.Sterling Augustine2018-03-271-0/+1
| | | | llvm-svn: 328667
* [X86] Add WriteFMOVMSK/WriteVecMOVMSK/WriteMMXMOVMSK scheduler classesSimon Pilgrim2018-03-2711-60/+57
| | | | | | | | Currently MOVMSK instructions use the WriteVecLogic class, which is a very poor choice given that MOVMSK involves a SSE->GPR transfer. Differential Revision: https://reviews.llvm.org/D44924 llvm-svn: 328664
* [DWARF][DWARF v5]: Adding support for dumping DW_RLE_offset_pair and ↵Wolfgang Pieb2018-03-271-19/+48
| | | | | | | | | | DW_RLE_base_address Reviewers: dblakie, aprantl Differential Revision: https://reviews.llvm.org/D44811 llvm-svn: 328662
* [YAML] Escape non-printable multibyte UTF8 in Output::scalarString.Graydon Hoare2018-03-272-31/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The existing YAML Output::scalarString code path includes a partial and incorrect implementation of YAML escaping logic. In particular, the logic put in place in rL321283 escapes non-printable bytes only if they are not part of a multibyte UTF8 sequence; implicitly this means that all multibyte UTF8 sequences -- printable and non -- are passed through verbatim. The simplest solution to this is to direct the Output::scalarString method to use the standalone yaml::escape function, and this _almost_ works, except that the existing code in that function _over_ escapes: any multibyte UTF8 sequence is escaped, even printable ones. While this is permitted for YAML, it is also more aggressive (and hard to read for non-English locales) than necessary, and the entire point of rL321283 was to back off such aggressive over-escaping. So in this change, I have both redirected Output::scalarString to use yaml::escape _and_ modified yaml::escape to optionally restrict its escaping to non-printables. This preserves behaviour of any existing clients while giving them a path to more moderate escaping should they desire. Reviewers: JDevlieghere, thegameg, MatzeB, vladimir.plyashkun Reviewed By: thegameg Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44863 llvm-svn: 328661
* 80-line wrap. NFCXin Tong2018-03-271-1/+2
| | | | llvm-svn: 328660
* AMDGPU: Fix not preserving CSR VGPR if used for SGPR spillsMatt Arsenault2018-03-271-4/+3
| | | | | | | | Before this was not done if the function had no calls in it. This is still a possible issue with any callable function, regardless of calls present. llvm-svn: 328659
* AMDGPU: Set natural stack alignment in DataLayoutMatt Arsenault2018-03-271-2/+2
| | | | | | | Only 4 byte alignment is ever useful, so increasing anything beyond this may require realigning the stack. llvm-svn: 328656
* [PGO] Fix branch probability remarks assertRong Xu2018-03-271-7/+9
| | | | | | | | | Fixed counter/weight overflow that leads to an assertion. Also fixed the help string for pgo-emit-branch-prob option. Differential Revision: https://reviews.llvm.org/D44809 llvm-svn: 328653
* AMDGPU: Fix crash when MachinePointerInfo invalidMatt Arsenault2018-03-271-1/+1
| | | | | | | | The combine on a select of a load only triggers for addrspace 0, and discards the MachinePointerInfo. The conservative default needs to be used for this. llvm-svn: 328652
* AMDGPU: Fix FP restore from being reordered with stack opsMatt Arsenault2018-03-271-1/+6
| | | | | | | | | | | | | | | | | In a function, s5 is used as the frame base SGPR. If a function is calling another function, during the call sequence it is copied to a preserved SGPR and restored. Before it was possible for the scheduler to move stack operations before the restore of s5, since there's nothing to associate a frame index access with the restore. Add an implicit use of s5 to the adjcallstack pseudo which ends the call sequence to preven this from happening. I'm not 100% satisfied with this solution, but I'm not sure what else would be better. llvm-svn: 328650
* [Hexagon] Implement TTI::shouldMaximizeVectorBandwidthKrzysztof Parzyszek2018-03-271-0/+1
| | | | llvm-svn: 328648
* [Power9] Fix the resource list for the COPY instruction.Stefan Pintilie2018-03-271-1/+1
| | | | | | | The COPY instruction was listed as a 4 cycle instruction. It is now listed correctly as a 2 cycle ALU instruction. llvm-svn: 328647
* Remap values in PromotedFloatsPirama Arumuga Nainar2018-03-271-0/+6
| | | | | | | | | | | | | | | | Summary: When a node is about to be erased from ReplacedValues, we should also remap its corresponding values in PromotedFloats. Patch by Yan Luo (Yan.Luo2@synopsys.com) Reviewers: pirama Reviewed By: pirama Subscribers: lebedev.ri, llvm-commits Differential Revision: https://reviews.llvm.org/D44872 llvm-svn: 328644
* [Hexagon] Rudimentary support for auto-vectorization for HVXKrzysztof Parzyszek2018-03-272-12/+155
| | | | | | | | This implements a set of TTI functions that the loop vectorizer uses. The only purpose of this is to enable testing. Auto-vectorization is disabled by default, enabled by -hexagon-autohvx. llvm-svn: 328639
* [AArch64] Decorate AArch64 instrs with OPERAND_PCRELRafael Auler2018-03-272-1/+27
| | | | | | | | | | | | | | Summary: This is a canonical way to teach objdump to print the target symbols for branches when disassembling AArch64 code. Reviewers: evandro, t.p.northover, espindola Reviewed By: t.p.northover Differential Revision: https://reviews.llvm.org/D44851 llvm-svn: 328638
* [NFC] OptPassGate extracted from OptBisectFedor Sergeev2018-03-278-26/+35
| | | | | | | | | | | | | | | Summary: This is an NFC refactoring of the OptBisect class to split it into an optional pass gate interface used by LLVMContext and the Optional Pass Bisector (OptBisect) used for debugging of optional passes. This refactoring is needed for D44464, which introduces setOptPassGate() method to allow implementations other than OptBisect. Patch by Yevgeny Rouban. Reviewers: andrew.w.kaylor, fedor.sergeev, vsk, dberlin, Eugene.Zelenko, reames, skatkov Reviewed By: fedor.sergeev Differential Revision: https://reviews.llvm.org/D44821 llvm-svn: 328637
* Use .set instead of = when printing assignment in assembly outputKrzysztof Parzyszek2018-03-271-1/+2
| | | | | | | | | On Hexagon "x = y" is a syntax used in most instructions, and is not treated as a directive. Differential Revision: https://reviews.llvm.org/D44256 llvm-svn: 328635
* [LV] Add TTI::shouldMaximizeVectorBandwidth to allow enabling it per targetKrzysztof Parzyszek2018-03-272-1/+6
| | | | | | | | The default implementation returns false and keeps the current behavior. Differential Revision: https://reviews.llvm.org/D44735 llvm-svn: 328632
* [X86][Btver2] Add MMX_PMOVMSKBrr to MOVMSK scheduler classSimon Pilgrim2018-03-271-1/+1
| | | | llvm-svn: 328620
* [PowerPC] Secure PLT supportStrahinja Petrovic2018-03-275-26/+91
| | | | | | | | This patch supports secure PLT mode for PowerPC 32 architecture. Differential Revision: https://reviews.llvm.org/D42112 llvm-svn: 328617
* [MIPS] Add static_assert that all Fixups are handled in getFixupKindAlexander Richardson2018-03-271-2/+7
| | | | | | | | | | | | | | | | | | Summary: I recently added a new Fixup kind to our fork of LLVM but forgot to add it to the table in MipsAsmBackend.cpp. With this static_assert the error would have been caught instead of zero-initializing the array entries for the new fixups. Reviewers: sdardis, atanasyan Reviewed By: atanasyan Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44895 llvm-svn: 328616
* [LoopUnroll][NFC] Remove redundant canPeel checkMax Kazantsev2018-03-271-2/+2
| | | | | | | | | | | | We check `canPeel` twice: when evaluating the number of iterations to be peeled and within the method `peelLoop` that performs peeling. This method is only executed if the calculated peel count is positive. Thus, the check in `peelLoop` can never fail. This patch replaces this check with an assert. Differential Revision: https://reviews.llvm.org/D44919 Reviewed By: fhahn llvm-svn: 328615
* [IRCE] Enable decreasing loops of non-const boundSam Parker2018-03-271-52/+74
| | | | | | | | | | | | | | As a follow-up to r328480, this updates the logic for the decreasing safety checks in a similar manner: - CanBeMax is replaced by CannotBeMaxInLoop which queries isLoopEntryGuardedByCond on the maximum value. - SumCanReachMin is replaced by isSafeDecreasingBound which includes some logic from parseLoopStructure and, again, has been updated to use isLoopEntryGuardedByCond on the given bounds. Differential Revision: https://reviews.llvm.org/D44776 llvm-svn: 328613
* [NFC] Fix comments in getExact()Max Kazantsev2018-03-271-12/+11
| | | | llvm-svn: 328612
* [SCEV] Make exact taken count calculation more optimisticMax Kazantsev2018-03-271-6/+16
| | | | | | | | | | | | | | | | | | | | | Currently, `getExact` fails if it sees two exit counts in different blocks. There is no solid reason to do so, given that we only calculate exact non-taken count for exiting blocks that dominate latch. Using this fact, we can simply take min out of all exits of all blocks to get the exact taken count. This patch makes the calculation more optimistic with enforcing our assumption with asserts. It allows us to calculate exact backedge taken count in trivial loops like for (int i = 0; i < 100; i++) { if (i > 50) break; . . . } Differential Revision: https://reviews.llvm.org/D44676 Reviewed By: fhahn llvm-svn: 328611
OpenPOWER on IntegriCloud