path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* Runtime flag to control branch funnel threshold | Vitaly Buka | 2018-04-06 | 1 | -2/+6
  Reviewers: pcc
  Subscribers: hiraditya, llvm-commits
  Differential Revision: https://reviews.llvm.org/D45193
  llvm-svn: 329459
* [NVPTX] Fixed vectorized LDG for f16. | Artem Belevich | 2018-04-06 | 1 | -0/+6
  v2f16 is a special case in NVPTX. v4f16 may be loaded as a pair of v2f16 and that was not previously handled correctly by tryLDGLDU().
  Differential Revision: https://reviews.llvm.org/D45339
  llvm-svn: 329456
* [RISCV] Tablegen-driven Instruction Compression. | Sameer AbuAsal | 2018-04-06 | 9 | -5/+350
  Summary: This patch implements a tablegen-driven Instruction Compression mechanism for generating RISCV compressed instructions (C Extension) from the expanded instruction form.

  This tablegen backend processes CompressPat declarations in a td file and generates all the compile-time and runtime checks required to validate the declarations, validate the input operands and generate correct instructions. The checks include validating register operands, immediate operands, fixed register operands and fixed immediate operands.

  Example:

      class CompressPat<dag input, dag output> {
        dag Input = input;
        dag Output = output;
        list<Predicate> Predicates = [];
      }

      let Predicates = [HasStdExtC] in {
        def : CompressPat<(ADD GPRNoX0:$rs1, GPRNoX0:$rs1, GPRNoX0:$rs2),
                          (C_ADD GPRNoX0:$rs1, GPRNoX0:$rs2)>;
      }

  The result is an auto-generated header file 'RISCVGenCompressEmitter.inc' which exports two functions for compressing/uncompressing MCInst instructions, plus some helper functions:

      bool compressInst(MCInst& OutInst, const MCInst &MI, const MCSubtargetInfo &STI, MCContext &Context);
      bool uncompressInst(MCInst& OutInst, const MCInst &MI, const MCRegisterInfo &MRI, const MCSubtargetInfo &STI);

  The clients that include this auto-generated header file and invoke these functions can compress an instruction before emitting it, in the target-specific ASM or ELF streamer, or can uncompress an instruction before printing it, when the expanded instruction format is favored.

  The following clients were added to implement compression/uncompression for RISCV:
  1) RISCVAsmParser::MatchAndEmitInstruction: Inserted a call to compressInst() to compress instructions parsed by llvm-mc coming from an ASM input.
  2) RISCVAsmPrinter::EmitInstruction: Inserted a call to compressInst() to compress instructions that were lowered from Machine Instructions (MachineInstr).
  3) RVInstPrinter::printInst: Inserted a call to uncompressInst() to print the expanded version of the instruction instead of the compressed one (e.g., add s0, s0, a5 instead of c.add s0, a5) when -riscv-no-aliases is not passed.

  This patch squashes D45119, D42780 and D41932. It was reviewed in smaller patches by asb, efriedma, apazos and mgrang.
  Reviewers: asb, efriedma, apazos, llvm-commits, sabuasal
  Reviewed By: sabuasal
  Subscribers: mgorny, eraman, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, niosHD, kito-cheng, shiva0217, zzheng
  Differential Revision: https://reviews.llvm.org/D45385
  llvm-svn: 329455
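To make the CompressPat example in the commit message concrete, here is a minimal hand-written sketch of the kind of check that single pattern implies. It is not the generated code from RISCVGenCompressEmitter.inc: the `Inst` struct, the `Opcode` enum, and the `compressAdd` and `isGPRNoX0` helpers are hypothetical stand-ins for MCInst and the generated helpers, and only the bool-plus-out-parameter shape is borrowed from the `compressInst` signature quoted above.

```cpp
#include <cassert>

// Toy stand-ins for MCInst; these types are illustrative, not LLVM's.
enum class Opcode { ADD, C_ADD };

struct Inst {
  Opcode Op;
  unsigned Rd;   // destination register number (x0..x31)
  unsigned Rs1;  // first source register
  unsigned Rs2;  // second source register
};

// Models the GPRNoX0 register class: any general-purpose register except x0.
inline bool isGPRNoX0(unsigned Reg) { return Reg >= 1 && Reg <= 31; }

// Hand-written equivalent of the one pattern
//   (ADD $rs1, $rs1, $rs2) -> (C_ADD $rs1, $rs2)
// Returns true and fills OutInst only when every operand check passes.
inline bool compressAdd(Inst &OutInst, const Inst &MI) {
  if (MI.Op != Opcode::ADD)
    return false;
  // The pattern reuses $rs1 as the destination, so the operands must be tied.
  if (MI.Rd != MI.Rs1)
    return false;
  // Both remaining operands must belong to the GPRNoX0 class.
  if (!isGPRNoX0(MI.Rd) || !isGPRNoX0(MI.Rs2))
    return false;
  OutInst = Inst{Opcode::C_ADD, MI.Rd, MI.Rd, MI.Rs2};
  return true;
}
```

With s0 as x8 and a5 as x15, `compressAdd` accepts `add s0, s0, a5` and rejects `add s0, s1, a5`, because the destination and first source differ in the second case.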
* [TableGen] Change std::sort to llvm::sort in response to r327219 | Mandeep Singh Grang | 2018-04-06 | 1 | -4/+4
  Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused by the undefined sorting order of objects having the same key.
  To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.
  Note: This patch is one of a series of patches to replace *all* std::sort with llvm::sort. Refer to the comments section in D44363 for a list of all the required patches.
  Reviewers: stoklund, kparzysz, dsanders
  Reviewed By: dsanders
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D45144
  llvm-svn: 329451
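The shuffle-before-sort idea can be sketched in a few lines of standard C++. This is an illustration of the technique described in the commit message, not LLVM's llvm::sort implementation: `shuffledSort`, `Entry`, and the two comparators are hypothetical names, and the explicit seed parameter is an assumption made so the example is reproducible.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <utility>
#include <vector>

// Hypothetical helper: shuffle the container, then sort it. If the comparator
// leaves some elements unordered relative to each other, different seeds can
// expose different final orders.
template <typename T, typename Compare>
void shuffledSort(std::vector<T> &V, Compare Cmp, unsigned Seed) {
  std::mt19937 Rng(Seed);
  std::shuffle(V.begin(), V.end(), Rng);
  std::sort(V.begin(), V.end(), Cmp);
}

using Entry = std::pair<int, char>; // (key, payload)

// Compares keys only: entries with equal keys have no defined relative order.
inline bool byKeyOnly(const Entry &A, const Entry &B) {
  return A.first < B.first;
}

// Compares key, then payload: a total order, so the sorted result is unique.
inline bool byKeyThenPayload(const Entry &A, const Entry &B) { return A < B; }
```

Sorting `{(1,'a'), (1,'b'), (0,'c')}` with `byKeyThenPayload` gives the same result for every seed, whereas with `byKeyOnly` the relative order of the two key-1 entries is unspecified, which is the kind of hidden order dependence the shuffle is meant to surface.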
* [StackProtector] Ignore certain intrinsics when calculating sspstrong heuristic. | Matt Davis | 2018-04-06 | 1 | -2/+13
  Summary: The 'strong' StackProtector heuristic takes into consideration call instructions. Certain intrinsics, such as lifetime.start, can cause the StackProtector to protect functions that do not need to be protected.

  Specifically, a volatile variable (not optimized away) that belongs to a stack allocation will cause an llvm.lifetime.start to be inserted during compilation. Because that intrinsic is a 'call', the strong StackProtector will see that the alloca'd variable is being passed to a call instruction, and insert a stack protector. In this case the intrinsic isn't really lowered to a call. This can cause unnecessary stack checking, at the cost of additional (wasted) CPU cycles.

  In the future we should rely on TargetTransformInfo::isLoweredToCall, but as of now that routine considers all intrinsics as not being lowerable. That needs to be corrected, and such a change is on my list of things to get moving on.

  As a side note, the updated stack-protector-dbginfo.ll test always seems to pass. I never see the dbg.declare/dbg.value reaching the StackProtector::HasAddressTaken, but I don't see any code excluding dbg intrinsic calls either, so I think it's the safest thing to do.
  Reviewers: void, timshen
  Reviewed By: timshen
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D45331
  llvm-svn: 329450
* [EarlyCSE] Add debug counter for debugging mis-optimizations. NFC. | Geoff Berry | 2018-04-06 | 1 | -24/+60
  Reviewers: reames, spatel, davide, dberlin
  Subscribers: mcrosier, llvm-commits
  Differential Revision: https://reviews.llvm.org/D45162
  llvm-svn: 329443
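The commit carries no description, so the following is only a sketch of the general idea behind a debug counter: let a pass skip its first few transformation opportunities and then perform a bounded number of them, which allows a mis-optimization to be bisected. `ToyDebugCounter` and its skip/count semantics are assumptions for illustration and do not reproduce the interface of LLVM's DebugCounter class.

```cpp
#include <cassert>

// Toy model of a debug counter. Every call to shouldExecute() consumes one
// "opportunity"; the first Skip opportunities are refused, the next Count are
// allowed, and everything after that is refused again.
class ToyDebugCounter {
public:
  ToyDebugCounter(long Skip, long Count) : Skip(Skip), Count(Count), Seen(0) {
    assert(Skip >= 0 && Count >= 0 && "skip and count must be non-negative");
  }

  bool shouldExecute() {
    long Index = Seen++;
    return Index >= Skip && Index < Skip + Count;
  }

private:
  long Skip;   // number of leading opportunities to refuse
  long Count;  // number of opportunities to allow after skipping
  long Seen;   // opportunities observed so far
};
```

A pass would guard each candidate transformation with `if (!Counter.shouldExecute()) continue;`, so narrowing the skip and count values isolates the single transformation that changes behavior.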
* [AMDGPU][MC][GFX9] Added s_call_b64 | Dmitry Preobrazhensky | 2018-04-06 | 1 | -0/+12
  See bug 36843: https://bugs.llvm.org/show_bug.cgi?id=36843
  Differential Revision: https://reviews.llvm.org/D45268
  Reviewers: artem.tamazov, arsenm, timcorringham
  llvm-svn: 329440
* [Hexagon] Fix assert with packetizing IMPLICIT_DEF instructions | Krzysztof Parzyszek | 2018-04-06 | 1 | -1/+5
  The compiler is generating a packet with the following instructions, which causes an undefined register assert in the verifier.

      $r0 = IMPLICIT_DEF
      $r1 = IMPLICIT_DEF
      S2_storerd_io killed $r29, 0, killed %d0

  The problem is that the packetizer is not saving the IMPLICIT_DEF instructions, which are needed when checking if it is legal to add the store instruction. The fix is to add the IMPLICIT_DEF instructions to the CurrentPacketMIs structure.
  Patch by Brendon Cahoon.
  llvm-svn: 329439
* [Hexagon] Prevent a stall across zero-latency instructions in a packet | Krzysztof Parzyszek | 2018-04-06 | 1 | -15/+16
  Packetizer keeps two zero-latency bound instructions in the same packet, ignoring the stalls on the later instruction. This should not be the case if there is no data dependence.
  Patch by Sumanth Gundapaneni.
  llvm-svn: 329437
* [Hexagon] Remove duplicated code, NFC | Krzysztof Parzyszek | 2018-04-06 | 1 | -9/+0
  llvm-svn: 329436
* [CodeGen] Change std::sort to llvm::sort in response to r327219 | Mandeep Singh Grang | 2018-04-06 | 20 | -76/+79
  Summary: r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused by the undefined sorting order of objects having the same key.
  To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.
  Note: This patch is one of a series of patches to replace *all* std::sort with llvm::sort. Refer to the comments section in D44363 for a list of all the required patches.
  Reviewers: bogner, rnk, MatzeB, RKSimon
  Reviewed By: rnk
  Subscribers: JDevlieghere, javed.absar, llvm-commits
  Differential Revision: https://reviews.llvm.org/D45133
  llvm-svn: 329435
* [Hexagon] Handle subregisters when calculating iteration count in HW loops | Krzysztof Parzyszek | 2018-04-06 | 1 | -0/+1
  llvm-svn: 329434
* [AMDGPU][MC][GFX9] Added instruction s_endpgm_ordered_ps_done | Dmitry Preobrazhensky | 2018-04-06 | 1 | -0/+7
  See bug 36844: https://bugs.llvm.org/show_bug.cgi?id=36844
  Differential Revision: https://reviews.llvm.org/D45313
  Reviewers: artem.tamazov, arsenm, timcorringham
  llvm-svn: 329430
* [InstCombine] limit nsz: -(X - Y) --> Y - X to hasOneUse() | Sanjay Patel | 2018-04-06 | 1 | -12/+9
  As noted in the post-commit discussion for r329350, we shouldn't generally assume that fsub is the same cost as fneg.
  llvm-svn: 329429
* [X86] Add appropriate ReadAfterLd for the register input to memory forms of ADC/SBB. | Craig Topper | 2018-04-06 | 5 | -32/+32
  llvm-svn: 329424
* Strip trailing whitespace. NFCI. | Simon Pilgrim | 2018-04-06 | 1 | -8/+8
  llvm-svn: 329421
* [AMDGPU][MC][GFX9] Added instructions *saveexec*, *wrexec* and *bitreplicate* | Dmitry Preobrazhensky | 2018-04-06 | 1 | -0/+21
  See bug 36840: https://bugs.llvm.org/show_bug.cgi?id=36840
  Differential Revision: https://reviews.llvm.org/D45250
  Reviewers: artem.tamazov, arsenm, timcorringham
  llvm-svn: 329419
* [X86] Remove InstRWs for basic arithmetic instructions from Sandy Bridge scheduler model. | Craig Topper | 2018-04-06 | 1 | -64/+4
  We can get this right through WriteALU and friends now.
  llvm-svn: 329417
* [X86] Attempt to model basic arithmetic instructions in the Haswell/Broadwell/Skylake scheduler models without InstRWs | Craig Topper | 2018-04-06 | 6 | -257/+35
  Summary: This patch removes InstRW overrides for basic arithmetic/logic instructions. To do this I've added the store address port to RMW and used a WriteSequence to make the latency additive.
  It does not cover ADC/SBB because they have different latency.
  Apparently we were inconsistent about whether the store has latency or not, thus the test changes.
  I've also left out Sandy Bridge because the load latency there is currently 4 cycles and should be 5.
  Reviewers: RKSimon, andreadb
  Reviewed By: andreadb
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D45351
  llvm-svn: 329416
* [X86] Add an extra store address cycle to WriteRMW in the Sandy Bridge/Broadwell/Haswell/Skylake scheduler model. | Craig Topper | 2018-04-06 | 5 | -15/+15
  Even though the address was calculated for the load, it's calculated again for the store.
  llvm-svn: 329415
* [X86] Merge itineraries for CLC, CMC, and STC. | Craig Topper | 2018-04-06 | 3 | -9/+5
  These are very simple flag setting instructions that appear to only be a single uop. They're unlikely to need this separation.
  llvm-svn: 329414
* [GlobalOpt] Fix support for casts in ctors. | Mircea Trofin | 2018-04-06 | 1 | -1/+5
  Summary: Fix an issue where initializations of globals whose constructors use casts were silently translated to 0-initialization.
  Reviewers: davidxl, evgeny777
  Reviewed By: evgeny777
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D45198
  llvm-svn: 329409
* [AMDGPU][MC][VI][GFX9] Added s_atc_probe* instructions | Dmitry Preobrazhensky | 2018-04-06 | 1 | -0/+28
  See bug 36839: https://bugs.llvm.org/show_bug.cgi?id=36839
  Differential Revision: https://reviews.llvm.org/D45249
  Reviewers: artem.tamazov, arsenm, timcorringham
  llvm-svn: 329408
* [ARC] Add <.f> suffix for F32_GEN4_{DOP|SOP}. | Pete Couperus | 2018-04-06 | 1 | -4/+32
  Add disassembler support for instructions which write back STATUS32.
  https://reviews.llvm.org/D45148
  Patch by Yan Luo! (Yan.Luo2@synopsys.com)
  llvm-svn: 329404
* [AMDGPU][MC][GFX9] Added s_dcache_discard* instructions | Dmitry Preobrazhensky | 2018-04-06 | 1 | -0/+30
  See bug 36838: https://bugs.llvm.org/show_bug.cgi?id=36838
  Differential Revision: https://reviews.llvm.org/D45247
  Reviewers: artem.tamazov, arsenm, timcorringham
  llvm-svn: 329397
* [LoopUnroll] Make LoopPeeling respect the AllowPeeling preference. | Chad Rosier | 2018-04-06 | 1 | -10/+14
  The SimpleLoopUnrollPass isn't supposed to perform loop peeling.
  Differential Revision: https://reviews.llvm.org/D45334
  llvm-svn: 329395
* DWARFVerifier: validate information in name index entries | Pavel Labath | 2018-04-06 | 2 | -3/+129
  Summary: This patch adds checks to verify that the information in the name index entries is consistent with the debug_info section. Specifically, we check that entries point to valid DIEs, and their names, tags, and compile units match the information in the debug_info sections.

  These checks are only run if the previous checks did not find any errors in the name index headers. Attempting to proceed with the checks anyway would likely produce a lot of spurious errors and the verification code would need to be very careful to avoid crashing.

  I also add a couple more checks to the abbreviation-validation code to verify that some attributes are always present (an index without a DW_IDX_die_offset attribute is fairly useless).

  The entry verification works only on indexes without any type units - I haven't attempted to extend it to type units, as we don't even have a DWARF v5-compatible type unit generator at the moment.
  Reviewers: JDevlieghere, aprantl, dblaikie
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D45323
  llvm-svn: 329392
* [X86][SandyBridge] Add (V)DPPS memory fold latencies | Simon Pilgrim | 2018-04-06 | 1 | -0/+14
  Noticed this during D44654
  llvm-svn: 329389
* [X86][SandyBridge] SBWriteResPair +5cy Memory Folds | Simon Pilgrim | 2018-04-06 | 1 | -6/+6
  As mentioned on D44647, this patch increases the default memory latency to +5cy, which more closely matches what most custom cases are doing for reg-mem instructions.
  I've bumped LoadLatency, ReadAfterLd and WriteLoad values to 5cy to be consistent.
  As Sandy Bridge is currently our default generic model, this affects a lot of scheduling tests...
  Differential Revision: https://reviews.llvm.org/D44654
  llvm-svn: 329388
* Tweak an assert message in the verifier | Hans Wennborg | 2018-04-06 | 1 | -1/+1
  llvm-svn: 329387
* [X86][SkylakeServer] Merge 2 InstRW entries to the same sched group. NFCI. | Simon Pilgrim | 2018-04-06 | 1 | -2/+2
  llvm-svn: 329386
* EntryExitInstrumenter: Handle musttail calls | Hans Wennborg | 2018-04-06 | 1 | -5/+15
  Inserting instrumentation between a musttail call and ret instruction would create invalid IR. Instead, treat musttail calls as function exits.
  llvm-svn: 329385
* [NFC] Add missing end of line symbols | Max Kazantsev | 2018-04-06 | 1 | -3/+3
  llvm-svn: 329383
* [MIR] Add support for MachineFrameInfo::LocalFrameSize | Francis Visoiu Mistrih | 2018-04-06 | 2 | -0/+2
  MFI.LocalFrameSize was not serialized. It is usually set from LocalStackSlotAllocation, so if that pass doesn't run it is impossible to deduce it from the stack objects. Until now, this information was lost.
  llvm-svn: 329382
* [debug_loc] Fix typo in DWARFExpression constructor | Pavel Labath | 2018-04-06 | 2 | -4/+8
  Summary: The positions of the DwarfVersion and AddressSize arguments were reversed, which broke parsing of DWARF opcodes that contain address-size-dependent operands (such as DW_OP_addr). Amusingly enough, none of the address-size asserts fired, as dwarf version was always 4, which is a valid address size.

  I ran into this when constructing weird inputs for the DWARF verifier. I add a test case as hand-written dwarf -- I am not sure how to trigger this differently, as having a DW_OP_addr inside a location list is a fairly non-standard thing to do.

  Fixing this error exposed a bug in the debug_loc.dwo parser, which was always being constructed with an address size of 0. I fix that as well by following the pattern in the non-dwo parser of picking up the address size from the first compile unit (which is technically not correct, but probably good enough in practice).
  Reviewers: JDevlieghere, aprantl, dblaikie
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D45324
  llvm-svn: 329381
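The failure mode described in this commit, two adjacent integer parameters passed in the wrong order without any diagnostic, can be reproduced with a small toy class. Nothing here is the real DWARFExpression: `ToyExpression`, `isValidAddressSize`, and the `DwarfVersion`/`AddrSize` wrapper structs are hypothetical, and the set of accepted address sizes is an assumption for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Address sizes this toy parser accepts (an assumption for the sketch).
inline bool isValidAddressSize(uint8_t Size) {
  return Size == 2 || Size == 4 || Size == 8;
}

// Toy expression object taking two plain integers. Swapping the arguments
// compiles cleanly, and the assert only fires if the value that lands in
// AddressSize happens to be an invalid size.
struct ToyExpression {
  uint16_t Version;
  uint8_t AddressSize;

  ToyExpression(uint16_t Version, uint8_t AddressSize)
      : Version(Version), AddressSize(AddressSize) {
    assert(isValidAddressSize(AddressSize) && "unsupported address size");
  }
};

// One possible hardening: distinct wrapper types make a swap a compile error.
struct DwarfVersion { uint16_t Value; };
struct AddrSize { uint8_t Value; };

inline ToyExpression makeExpression(DwarfVersion V, AddrSize A) {
  return ToyExpression(V.Value, A.Value);
}
```

A caller holding version 4 and address size 8 who writes `ToyExpression(8, 4)` gets an object whose `AddressSize` is 4, and the assert stays silent because 4 is itself a valid size. The wrapper-type variant is a design choice offered here for illustration; the commit itself only corrects the argument order.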
* [NFC] Loosen restriction on preheader to fix buildbot | Max Kazantsev | 2018-04-06 | 1 | -5/+5
  llvm-svn: 329379
* [PowerPC] allow D-form VSX load/store when accessing FrameIndex without offset | Hiroshi Inoue | 2018-04-06 | 1 | -8/+16
  VSX D-form load/store instructions of POWER9 require the offset to be a multiple of 16, and a helper `isOffsetMultipleOf` is used to check this.
  So far, the helper handles the FrameIndex + offset case, but not the FrameIndex without offset case. Due to this, we are missing opportunities to exploit D-form instructions when accessing an object or array allocated on stack.
  For example, X-form store (stxvx) is used for int a[4] = {0}; instead of D-form store (stxv). For larger arrays, the D-form instruction is not used when accessing the first 16 bytes. Using D-form instructions reduces register pressure as well as the number of instructions.
  Differential Revision: https://reviews.llvm.org/D45079
  llvm-svn: 329377
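The gap described in this commit amounts to treating a bare frame index as an offset of zero. The sketch below models only that decision; `ToyAddress` and `isOffsetMultipleOf16` are hypothetical names, and the real helper works on SelectionDAG nodes and may perform further checks that are not modeled here.

```cpp
#include <cassert>

// Toy description of a memory operand: either "FrameIndex + Offset" or a bare
// FrameIndex with no explicit offset.
struct ToyAddress {
  bool IsFrameIndex; // address is based on a stack frame index
  bool HasOffset;    // an explicit constant offset is present
  long Offset;       // meaningful only when HasOffset is true
};

// Returns true when the access could use a D-form instruction under the
// simplified rule "offset must be a multiple of 16".
inline bool isOffsetMultipleOf16(const ToyAddress &A) {
  if (!A.IsFrameIndex)
    return false;
  // A bare FrameIndex is treated as offset 0, which is a multiple of 16.
  long Off = A.HasOffset ? A.Offset : 0;
  return Off % 16 == 0;
}
```

Under this rule an access to the start of a stack array (no offset) qualifies, as does offset 32, while offset 8 does not.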
* [LLVM-C] Audit Inline Assembly APIs for Consistency | Robert Widmann | 2018-04-06 | 1 | -0/+32
  Summary:
  - Add a missing getter for module-level inline assembly
  - Add a missing append function for module-level inline assembly
  - Deprecate LLVMSetModuleInlineAsm and replace it with LLVMSetModuleInlineAsm2 which takes an explicit length parameter
  - Deprecate LLVMConstInlineAsm and replace it with LLVMGetInlineAsm, a function that allows passing a dialect and is not mis-classified as a constant operation
  Reviewers: whitequark, deadalnix
  Reviewed By: whitequark
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D45346
  llvm-svn: 329369
* Fix lld-x86_64-darwin13 build fails. | Manoj Gupta | 2018-04-05 | 1 | -4/+4
  Use double braces in std::array initialization to keep Darwin builders happy.
  llvm-svn: 329363
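For readers unfamiliar with the double-brace form: std::array is an aggregate wrapping a built-in array, so the outer braces initialize the std::array and the inner braces initialize the wrapped array. The sketch below shows the form with a scalar element type and with a struct element type; `Range`, `makeTable`, and `makeRanges` are hypothetical names, and the exact diagnostic that failed on the Darwin builder is not stated in the commit.

```cpp
#include <array>
#include <cassert>

struct Range {
  int Lo;
  int Hi;
};

// Outer braces: the std::array aggregate. Inner braces: its underlying array.
inline std::array<int, 3> makeTable() {
  std::array<int, 3> Table = {{1, 2, 3}};
  return Table;
}

// With aggregate elements, each element gets its own braces inside the
// double-brace pair, which keeps the initializer unambiguous.
inline std::array<Range, 2> makeRanges() {
  std::array<Range, 2> Ranges = {{{0, 9}, {10, 19}}};
  return Ranges;
}
```

Single braces are accepted by many compilers through brace elision, but some toolchains warn about the missing braces, and such a warning becomes a build failure when warnings are treated as errors.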
* [InstCombine] FP: Z - (X - Y) --> Z + (Y - X) | Sanjay Patel | 2018-04-05 | 1 | -2/+11
  This restores what was lost with rL73243 but without re-introducing the bug that was present in the old code.
  Note that we already have these transforms if the ops are marked 'fast' (and I assume that's happening somewhere in the code added with rL170471), but we clearly don't need all of 'fast' for these transforms.
  llvm-svn: 329362
* Attempt to fix Mips breakages. | Manoj Gupta | 2018-04-05 | 1 | -7/+8
  Summary: Replace ArrayRefs by actual std::array objects so that there are no dangling references.
  Reviewers: rsmith, gkistanova
  Subscribers: sdardis, arichardson, llvm-commits
  Differential Revision: https://reviews.llvm.org/D45338
  llvm-svn: 329359
* [X86] Separate CDQ and CDQE in the scheduler model. | Craig Topper | 2018-04-05 | 5 | -20/+10
  According to Agner's data, CDQE is closer to CWDE.
  llvm-svn: 329354
* [IR] Change std::sort to llvm::sort in response to r327219 | Mandeep Singh Grang | 2018-04-05 | 4 | -6/+6
  r327219 added wrappers to std::sort which randomly shuffle the container before sorting. This will help in uncovering non-determinism caused by the undefined sorting order of objects having the same key.
  To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.
  Note: This patch is one of a series of patches to replace *all* std::sort with llvm::sort. Refer to D44363 for a list of all the required patches.
  llvm-svn: 329353
* [X86] Add MOVZPQILo2PQIrr to the Sandy Bridge scheduler model | Craig Topper | 2018-04-05 | 1 | -1/+1
  llvm-svn: 329351
* [InstCombine] nsz: -(X - Y) --> Y - X | Sanjay Patel | 2018-04-05 | 1 | -4/+11
  This restores part of the fold that was removed with rL73243 (PR4374).
  llvm-svn: 329350
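The nsz ("no signed zeros") qualifier on this fold matters because the two forms differ in the sign of a zero result. The sketch below demonstrates this with ordinary IEEE-754 doubles under the default rounding mode and without fast-math compiler options; `negOfSub` and `swappedSub` are hypothetical helper names.

```cpp
#include <cassert>
#include <cmath>

// The original form: negate the difference.
inline double negOfSub(double X, double Y) { return -(X - Y); }

// The folded form: swap the operands instead of negating.
inline double swappedSub(double X, double Y) { return Y - X; }
```

For unequal inputs both functions return the same value, for example -2.0 for X = 3.0 and Y = 1.0. For X == Y, `negOfSub` returns -0.0 while `swappedSub` returns +0.0, which `std::signbit` can tell apart. The rewrite is therefore only valid when the sign of zero may be ignored, which is what the nsz flag expresses.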
* [X86] Add LEAVE instruction to the scheduler models using the same data as LEAVE64. Make LEAVE/LEAVE64 more correct on Sandy Bridge. | Craig Topper | 2018-04-05 | 5 | -22/+16
  This is the 32-bit mode version of LEAVE64. It should be at least somewhat similar to LEAVE64.
  The Sandy Bridge version was missing a load port use.
  llvm-svn: 329347
* [DWARF v5][NFC]: Refactor DebugRnglists to prepare for the support of the DW_AT_ranges attribute in conjunction with .debug_rnglists. | Wolfgang Pieb | 2018-04-05 | 2 | -99/+128
  Reviewers: JDevlieghere
  Differential Revision: https://reviews.llvm.org/D45307
  llvm-svn: 329345
* AMDGPU/Metadata: Always report a fixed number of hidden arguments | Konstantin Zhuravlyov | 2018-04-05 | 1 | -8/+12
  Currently it is 6. If the "feature" was not used, report a dummy hidden argument. Otherwise it does not match the kernarg size reported in the kernel header.
  Differential Revision: https://reviews.llvm.org/D45129
  llvm-svn: 329341
* [X86] Remove some InstRWs for plain store instructions on Sandy Bridge. | Craig Topper | 2018-04-05 | 5 | -39/+5
  We were forcing the latency of these instructions to 5 cycles, but every other scheduler model had them as 1 cycle. I'm sure I didn't get everything, but this gets a big portion.
  llvm-svn: 329339
* [RuntimeDyld][PowerPC] Use global entry points for calls between sections. | Lang Hames | 2018-04-05 | 1 | -9/+13
  Functions in different objects may use different TOCs, so calls between such functions should use the global entry point of the callee which updates the TOC pointer.
  This should fix a bug that the Numba developers encountered (see https://github.com/numba/numba/issues/2451).
  Patch by Olexa Bilaniuk. Thanks Olexa!
  No RuntimeDyld checker test case yet as I am not familiar enough with how RuntimeDyldELF fixes up call-sites, but I do not want to hold up landing this. I will continue to work on it and see if I can rope some powerpc experts in.
  llvm-svn: 329335