summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [X86] Redefine MOVSS/MOVSD instructions to take VR128 regclass as input ↵Craig Topper2017-10-043-77/+73
| | | | | | | | | | | | | | instead of FR32/FR64 This patch redefines the MOVSS/MOVSD instructions to take VR128 as its second input. This allows the MOVSS/SD->BLEND commute to work without requiring a COPY to be inserted. This should fix PR33079 Overall this looks to be an improvement in the generated code. I haven't checked the EXPENSIVE_CHECKS build but I'll do that and update with results. Differential Revision: https://reviews.llvm.org/D38449 llvm-svn: 314914
* bpf: fix an insn encoding issue for neg insnYonghong Song2017-10-041-2/+0
| | | | | Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 314911
* [OptRemark] Move YAML writing to IRAdam Nemet2017-10-044-92/+88
| | | | | | | | | | | | Before the patch this was in Analysis. Moving it to IR and making it implicit part of LLVMContext::diagnose allows the full opt-remark facility to be used outside passes e.g. the pass manager. Jessica is planning to use this to report function size after each pass. The same could be used for time reports. Tested with BUILD_SHARED_LIBS=On. llvm-svn: 314909
* Also update MachineORE after r314874.Adam Nemet2017-10-041-4/+2
| | | | llvm-svn: 314908
* [NFC] clang-format lib/Transforms/Scalar/MergeICmps.cppClement Courbet2017-10-041-39/+21
| | | | llvm-svn: 314906
* [X86][SSE] Early out from ComputeNumSignBitsForTargetNode. NFCI.Simon Pilgrim2017-10-041-2/+6
| | | | | | Early out from vector shift by immediates that will exceed eltsize - don't bother making an unnecessary ComputeNumSignBits recursive call. llvm-svn: 314903
* [X86][SSE] Add support for lowering unary shuffles to PACKSS/PACKUSSimon Pilgrim2017-10-041-12/+24
| | | | | | Extension to D38472 llvm-svn: 314901
* [AVR] Implement LPMWRdZ pseudo-instruction's expansion.Dylan McKay2017-10-041-1/+44
| | | | | | | | | FIXME: implementation is mostly copy-pasted from LDWRdPtr, so we should refactor a bit and unify the two Patch by Gerdo Erdi. llvm-svn: 314898
* [AVR] Factor out mayLoad in tablegen patternsDylan McKay2017-10-041-2/+2
| | | | | | Patch by Gergo Erdi. llvm-svn: 314897
* [AVR] Elaborate LDWRdPtr into `ld r, X++; ld r+1, X`Dylan McKay2017-10-042-7/+7
| | | | | | Patch by Gergo Erdi. llvm-svn: 314896
* [AVR] Insert JMP for long branchesDylan McKay2017-10-042-2/+22
| | | | | | | | | | | Previously, on long branches (relative jumps of >4 kB), an assertion failure was hit, as AVRInstrInfo::insertIndirectBranch was not implemented. Despite its name, it is called by the branch relaxator for *all* unconditional jumps. Patch by Thomas Backman. llvm-svn: 314891
* [AVR] Fix displacement overflow for LDDW/STDWDylan McKay2017-10-042-5/+13
| | | | | | | | | | | | | | | | | | | In some cases, the code generator attempts to generate instructions such as: lddw r24, Y+63 which expands to: ldd r24, Y+63 ldd r25, Y+64 # Oops! This is actually ld r25, Y in the binary This commit limits the first offset to 62, and thus the second to 63. It also updates some asserts in AVRExpandPseudoInsts.cpp, including for INW and OUTW, which appear to be unused. Patch by Thomas Backman. llvm-svn: 314890
* [ARM] Add diag string for movw/movt immediates in assemblyOliver Stannard2017-10-041-0/+1
| | | | | | | | | This adds diagnostics for invalid immediate operands to the MOVW and MOVT instructions (ARM and Thumb). Differential revision: https://reviews.llvm.org/D31879 llvm-svn: 314888
* [ARM, Asm] Change grammar of immediate operand diagnosticsOliver Stannard2017-10-041-3/+3
| | | | | | | | | | | | | | | | | Currently, our diagnostics for assembly operands are not consistent. Some start with (for example) "immediate operand must be ...", and some with "operand must be an immediate ...". I think the latter form is preferable for a few reasons: * It's unambiguous that it is referring to the expected type of operand, not the type the user provided. For example, the user could provide an register operand, and get a message taking about an operand is if it is already an immediate, just not in the accepted range. * It allows us to have a consistent style once we add diagnostics for operands that could take two forms, for example a label or pc-relative memory operand. Differential revision: https://reviews.llvm.org/D36689 llvm-svn: 314887
* [X86] Improvement in CodeGen instruction selection for LEAs (re-applying ↵Jatin Bhateja2017-10-043-10/+506
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | post required revision changes.) Summary: 1/ Operand folding during complex pattern matching for LEAs has been extended, such that it promotes Scale to accommodate similar operand appearing in the DAG. e.g. T1 = A + B T2 = T1 + 10 T3 = T2 + A For above DAG rooted at T3, X86AddressMode will no look like Base = B , Index = A , Scale = 2 , Disp = 10 2/ During OptimizeLEAPass down the pipeline factorization is now performed over LEAs so that if there is an opportunity then complex LEAs (having 3 operands) could be factored out. e.g. leal 1(%rax,%rcx,1), %rdx leal 1(%rax,%rcx,2), %rcx will be factored as following leal 1(%rax,%rcx,1), %rdx leal (%rdx,%rcx) , %edx 3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops, thus avoiding creation of any complex LEAs within a loop. Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy Reviewed By: lsaba Subscribers: jmolloy, spatel, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D35014 llvm-svn: 314886
* [MC] - Don't assert when non-english characters are used.George Rimar2017-10-041-12/+13
| | | | | | | | | | | | | | | | | | I found that llvm-mc does not like non-english characters even in comments, which it tries to tokenize. Problem happens because of functions like isdigit(), isalnum() which takes int argument and expects it is not negative. But at the same time MCParser uses char* to store input buffer poiner, char has signed value, so it is possible to pass negative value to one of functions from above and that triggers an assert. Testcase for demonstration is provided. To fix the issue helper functions were introduced in StringExtras.h Differential revision: https://reviews.llvm.org/D38461 llvm-svn: 314883
* Recommit [UnreachableBlockElim] Use COPY if PHI input is undefMikael Holmen2017-10-041-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This time invoking llc with "-march=x86-64" in the testcase, so we don't assume the default target is x86. Summary: If we have %vreg0<def> = PHI %vreg2<undef>, <BB#0>, %vreg3, <BB#2>; GR32:%vreg0,%vreg2,%vreg3 %vreg3<def,tied1> = ADD32ri8 %vreg0<kill,tied0>, 1, %EFLAGS<imp-def>; GR32:%vreg3,%vreg0 then we can't just change %vreg0 into %vreg3, since %vreg2 is actually undef. We would have to also copy the undef flag to be able to change the register. Instead we deal with this case like other cases where we can't just replace the register: we insert a COPY. The code creating the COPY already copied all flags from the PHI input, so the undef flag will be transferred as it should. Reviewers: kparzysz Reviewed By: kparzysz Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38235 llvm-svn: 314882
* [IRCE] Temporarily disable unsigned latch conditions by defaultMax Kazantsev2017-10-041-0/+21
| | | | | | | | | | | | We have found some corner cases connected to range intersection where IRCE makes a bad thing when the latch condition is unsigned. The fix for that will go as a follow up. This patch temporarily disables IRCE for unsigned latch conditions until the issue is fixed. The unsigned latch conditions were introduced to IRCE by rL310027. Differential Revision: https://reviews.llvm.org/D38529 llvm-svn: 314881
* Revert r314879 "[UnreachableBlockElim] Use COPY if PHI input is undef"Mikael Holmen2017-10-041-3/+2
| | | | | | Build-bots broke on the new testcase. I'll investigate and fix. llvm-svn: 314880
* [UnreachableBlockElim] Use COPY if PHI input is undefMikael Holmen2017-10-041-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: If we have %vreg0<def> = PHI %vreg2<undef>, <BB#0>, %vreg3, <BB#2>; GR32:%vreg0,%vreg2,%vreg3 %vreg3<def,tied1> = ADD32ri8 %vreg0<kill,tied0>, 1, %EFLAGS<imp-def>; GR32:%vreg3,%vreg0 then we can't just change %vreg0 into %vreg3, since %vreg2 is actually undef. We would have to also copy the undef flag to be able to change the register. Instead we deal with this case like other cases where we can't just replace the register: we insert a COPY. The code creating the COPY already copied all flags from the PHI input, so the undef flag will be transferred as it should. Reviewers: kparzysz Reviewed By: kparzysz Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38235 llvm-svn: 314879
* [X86] Fix using the SJLJ jump table on x86_64Martin Storsjo2017-10-041-10/+65
| | | | | | | | | | | | | | | | The previous version didn't work if the jump table base address didn't fit in 32 bit, since it was encoded as an immediate offset. And in case the jump table is encoded as 32 bit label differences, we need to load and add them to the table base first. This solves the first half of the issues mentioned in PR34720. Also fix some of the errors pointed out by -verify-machineinstrs, by using GR32_NOSPRegClass. Differential Revision: https://reviews.llvm.org/D38333 llvm-svn: 314876
* Move verbosity check for remarks to the diag handlerAdam Nemet2017-10-042-5/+6
| | | | | | | | Test needs some slight adjustment because we no longer check the existence of BFI but rather that the actual hotness is set on the remark. If entry_count is not set getBlockProfileCount returns None. llvm-svn: 314874
* [FuzzerUtil] Partially revert D38481 on FuzzerUtilTim Shen2017-10-041-1/+9
| | | | | | | | | This is because lib/Fuzzer doesn't really depend on llvm infrastucture. It's not easy to access the llvm hardware_concurrency here. Differential Reivision: https://reviews.llvm.org/D38481 llvm-svn: 314870
* Simplify multikey_qsort function.Rui Ueyama2017-10-031-24/+20
| | | | | | | This function implements the three-way radix quicksort algorithm. This patch simplifies the implementation by using MutableArrayRef. llvm-svn: 314858
* [AArch64] Use LateSimplifyCFG after expanding atomic operations.Balaram Makam2017-10-031-1/+1
| | | | | | | | | | | | | | | | | Summary: After r308422 we defer optimizations that can destroy loop canonical forms to LateSimplifyCFG. Running LateSimplifyCFG after expanding atomic operations can exploit more control-flow opportunities. Reviewers: mcrosier, t.p.northover, efriedma Reviewed By: efriedma Subscribers: aemerson, rengolin, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D38262 llvm-svn: 314857
* AMDGPU: Expand setcc for v2f32 and v4f32Konstantin Zhuravlyov2017-10-031-0/+1
| | | | llvm-svn: 314853
* AMDGPU: Expand setcc for v2i32 and v4i32Konstantin Zhuravlyov2017-10-031-0/+1
| | | | llvm-svn: 314852
* AMDGPU: Add ELFOSABI_AMDGPU_MESA3DKonstantin Zhuravlyov2017-10-031-0/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D38387 llvm-svn: 314846
* [X86] Remove dead declaration convertArgMovsToPushes, NFCReid Kleckner2017-10-031-9/+0
| | | | | | | | This was dead when it landed in r252578. We have this functionality, if not for stack probe calls, but for regular calls in X86CallFrameOptimization.cpp. llvm-svn: 314845
* Pre-compute the tail of the archiveRafael Espindola2017-10-031-177/+184
| | | | | | | | | | | | | | | | | | | | | | | | An archive looks like <header> <symbol table> <tail> The symbol table refers to offsets in the tail. A complication is that we would like to support symbol tables that use 64 bit offsets if it turns out that any of the offsets is too big. This patch changes the archive writer to first compute the tail. We cannot just compute one big StringRef since that would require reading every member upfront, but we can represent it as a series of StringRefs. Having done that it is much easier to compute the symbol table and all offsets are computed before it is written. With this if there is an accounting problem it will show up with a regular symbol table, not just when a 64 bit one is needed. llvm-svn: 314844
* AMDGPU: Add ELFOSABI_AMDGPU_PALKonstantin Zhuravlyov2017-10-031-2/+3
| | | | llvm-svn: 314843
* Refactor DIBuilder dbg intrinsic insertion, NFCReid Kleckner2017-10-031-58/+57
| | | | | | | Both dbg.declare and dbg.value insertion had duplicate code for the two overloads with different insertion point conventions. llvm-svn: 314839
* [MachineOutliner] Fix off-by-one in cost modelJessica Paquette2017-10-031-35/+36
| | | | | | | | | | This commit does two things. Firstly, it cleans up some of the benefit calculation wrt outlined functions and candidates. Secondly, it fixes an off-by-one bug in the cost model which was caused by the benefit value of an OutlinedFunction and Candidate differing by 1. It updates the remarks test to reflect this change. llvm-svn: 314836
* [PowerPC] Revert P9 scheduling model to incompleteStefan Pintilie2017-10-031-1/+1
| | | | | | | Partially revert a previous change from commit: https://llvm.org/svn/llvm-project/llvm/trunk@314026 The previous change caused regressions on Power 9. llvm-svn: 314835
* [InstCombine] Use isSignBitCheck to simplify an if statement. Directly ↵Craig Topper2017-10-031-12/+8
| | | | | | | | create new sign bit compares instead of manipulating the constant. NFCI Since we no longer had the direct constant compares, manipulating the constant seemeded less clear. llvm-svn: 314830
* [AMDGPU] implemented pal metadataTim Renouf2017-10-036-3/+193
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: For the amdpal OS type: We write an AMDGPU_PAL_METADATA record in the .note section in the ELF (or as an assembler directive). It contains key=value pairs of 32 bit ints. It is a merge of metadata from codegen of the shaders, and metadata provided by the frontend as _amdgpu_pal_metadata IR metadata. Where both sources have a key=value with the same key, the two values are ORed together. This .note record is part of the amdpal ABI and will be documented in docs/AMDGPUUsage.rst in a future commit. Eventually the amdpal OS type will stop generating the .AMDGPU.config section once the frontend has safely moved over to using the .note records above instead of .AMDGPU.config. Reviewers: arsenm, nhaehnle, dstuttard Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D37753 llvm-svn: 314829
* [AMDGPU] Avoid predicated execution of the basic blocks containing scalarAlexander Timofeev2017-10-031-0/+10
| | | | | | | | instructions. Differential revision: https://reviews.llvm.org/D38293 llvm-svn: 314828
* Fix -Wcovered-switch-default warnings from r314821Hans Wennborg2017-10-031-3/+2
| | | | llvm-svn: 314826
* Revert r314817 "[dwarfdump] Add -lookup option"Hans Wennborg2017-10-032-30/+2
| | | | | | | | | | | | | | | | | The test fails on Linux; see follow-up email on the llvm-commits list. > Add the option to lookup an address in the debug information and print > out the file, function, block and line table details. > > Differential revision: https://reviews.llvm.org/D38409 This also reverts the follow-up r314818: > [test] Fix llvm-dwarfdump/cmdline.test > > Fixes test/tools/llvm-dwarfdump/cmdline.test llvm-svn: 314825
* Revert r314806 "[SLP] Vectorize jumbled memory loads."Hans Wennborg2017-10-032-256/+84
| | | | | | | | | | | | | | | | | | | | | | | | | All the buildbots are red, e.g. http://lab.llvm.org:8011/builders/clang-cmake-aarch64-lld/builds/2436/ > Summary: > This patch tries to vectorize loads of consecutive memory accesses, accessed > in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 > which was reverted back due to some basic issue with representing the 'use mask' of > jumbled accesses. > > This patch fixes the mask representation by recording the 'use mask' in the usertree entry. > > Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df > > Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh > > Reviewed By: Ayal > > Subscribers: hans, mzolotukhin > > Differential Revision: https://reviews.llvm.org/D36130 llvm-svn: 314824
* Implement David Blaikie's suggestion for comparison operatorsReid Kleckner2017-10-031-3/+8
| | | | llvm-svn: 314822
* CodeView: Provide a .def file with the register idsHans Wennborg2017-10-033-143/+130
| | | | | | | | | | | | | | The list of register ids was previously written out in a couple of dirrent places. This puts it in a .def file and also adds a few more registers (e.g. the x87 regs) which should lead to more readable dumps, but I didn't include the whole list since that seems unnecessary. X86_MC::initLLVMToSEHAndCVRegMapping is pretty ugly, but at least it's not relying on magic constants anymore. The TODO of using tablegen still stands. Differential revision: https://reviews.llvm.org/D38480 llvm-svn: 314821
* [DebugInfo] Correctly coalesce DBG_VALUEs that mix direct and indirect valuesReid Kleckner2017-10-031-83/+126
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This should fix a regression introduced by r313786, which switched from MachineInstr::isIndirectDebugValue() to checking if operand 1 is an immediate. I didn't have a test case for it until now. A single UserValue, which approximates a user variable, may have many DBG_VALUE instructions that disagree about whether the variable is in memory or in a virtual register. This will become much more common once we have llvm.dbg.addr, but you can construct such a test case manually today with llvm.dbg.value. Before this change, we would get two UserValues: one for direct and one for indirect DBG_VALUE instructions describing the same variable. If we build separate interval maps for direct and indirect locations, we will end up accidentally coalescing identical DBG_VALUE intervals that need to remain separate because they are broken up by intervals of the opposite direct-ness. Reviewers: aprantl Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D37932 llvm-svn: 314819
* [dwarfdump] Add -lookup optionJonas Devlieghere2017-10-032-2/+30
| | | | | | | | | Add the option to lookup an address in the debug information and print out the file, function, block and line table details. Differential revision: https://reviews.llvm.org/D38409 llvm-svn: 314817
* Revert "Re-enable "[MachineCopyPropagation] Extend pass to do COPY source ↵Geoff Berry2017-10-033-636/+23
| | | | | | | | | | forwarding"" This reverts commit r314729. Another bug has been encountered in an out-of-tree target reported by Quentin. llvm-svn: 314814
* Use sched_getaffinity instead of std::thread::hardware_concurrency.Rafael Espindola2017-10-034-13/+20
| | | | | | | | | | | | | | The issue with std::thread::hardware_concurrency is that it forwards to libc and some implementations (like glibc) don't take thread affinity into consideration. With this change a llvm program that can execute in only 2 cores will use 2 threads, even if the machine has 32 cores. This makes benchmarking a lot easier, but should also help if someone doesn't want to use all cores for compilation for example. llvm-svn: 314809
* Revert the change that accidentally went in r314806.Dehao Chen2017-10-031-4/+0
| | | | llvm-svn: 314807
* [SLP] Vectorize jumbled memory loads.Mohammad Shahid2017-10-032-84/+256
| | | | | | | | | | | | | | | | | | | | | | Summary: This patch tries to vectorize loads of consecutive memory accesses, accessed in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 which was reverted back due to some basic issue with representing the 'use mask' of jumbled accesses. This patch fixes the mask representation by recording the 'use mask' in the usertree entry. Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh Reviewed By: Ayal Subscribers: hans, mzolotukhin Differential Revision: https://reviews.llvm.org/D36130 llvm-svn: 314806
* [ARM] Use table-gen'd assembly operand diags in ARM asm parserOliver Stannard2017-10-033-95/+22
| | | | | | | | | | | | This switches the ARM AsmParser to use assembly operand diagnostics from tablegen, rather than a switch statement on the ARMMatchResultTy. It moves the existing diagnostic strings to tablegen, but adds no new ones, so this is NFC except for one diagnostic string that had an off-by-1 error in the hand-written switch statement. Differential revision: https://reviews.llvm.org/D31607 llvm-svn: 314804
* [ARM, Asm] Use correct source location for register tokensOliver Stannard2017-10-031-3/+3
| | | | | | | | | | | | tryParseRegister advances the lexer, so we need to take copies of the start and end locations of the register operand before calling it. Previously, the caret in the diagnostic pointer to the comma after the r0 operand in the test, rather than the start of the operand. Differential revision: https://reviews.llvm.org/D31537 llvm-svn: 314799
OpenPOWER on IntegriCloud