path: root/llvm/lib
Commit message (Author, Age, Files, Lines)
...
* Revert r314928 to investigate thinLTO bootstrap failure (Xinliang David Li, 2017-10-05, 2 files, -234/+0)
    llvm-svn: 314961
* [X86] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). (Eugene Zelenko, 2017-10-05, 8 files, -127/+198)
    llvm-svn: 314953
* AMDGPU: Add comment about clamps (Matt Arsenault, 2017-10-05, 1 file, -0/+2)
    llvm-svn: 314952
* AMDGPU: Do not fold clamp instructions when sources are different (Matt Arsenault, 2017-10-05, 1 file, -0/+1)
    Patch by hakzsam (Samuel Pitoiset)
    llvm-svn: 314951
* [InstCombine] Improve support for ashr in foldICmpAndShift (Craig Topper, 2017-10-04, 1 file, -9/+12)
    We can support ashr similarly to lshr if we know that none of the shifted-in bits are used. In that case SimplifyDemandedBits would normally convert it to lshr, but that conversion doesn't happen if the shift has additional users.
    Differential Revision: https://reviews.llvm.org/D38521
    llvm-svn: 314945
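To illustrate the r314945 claim that ashr can be treated like lshr when none of the shifted-in bits are used, here is a minimal sketch on 32-bit values. The helpers `lshr32`, `ashr32` and `maskIgnoresShiftedInBits` are hypothetical names made up for this note; this is not the InstCombine code, only the bit-level property it relies on.

```cpp
#include <cstdint>

// Logical shift right: vacated high bits are filled with zeros.
constexpr std::uint32_t lshr32(std::uint32_t X, unsigned S) { return X >> S; }

// Arithmetic shift right, emulated on an unsigned value so the result is
// portable: vacated high bits are filled with copies of the sign bit.
// Assumes 0 <= S < 32.
constexpr std::uint32_t ashr32(std::uint32_t X, unsigned S) {
  return (X >> S) |
         (((X >> 31) != 0 && S != 0) ? (std::uint32_t{0xFFFFFFFFu} << (32 - S))
                                     : std::uint32_t{0});
}

// True when Mask selects none of the S high bits that the shift fills in.
constexpr bool maskIgnoresShiftedInBits(std::uint32_t Mask, unsigned S) {
  return S == 0 || (Mask >> (32 - S)) == 0;
}

// With such a mask, the two shifts are indistinguishable after the 'and'.
static_assert((ashr32(0x80000010u, 4) & 0x0FFFFFFFu) ==
                  (lshr32(0x80000010u, 4) & 0x0FFFFFFFu),
              "ashr and lshr agree when the shifted-in bits are masked off");
```

When the mask does cover a shifted-in bit (for example `0xF0000000` with a shift of 4), the two results differ for negative inputs, which is why the check is needed.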
* AMDGPU: Fix not accounting for instruction size in bundles (Matt Arsenault, 2017-10-04, 2 files, -1/+15)
    These were counted as 0. Fixes branch limit exceeded errors in some large programs.
    llvm-svn: 314944
* AMDGPU: Correctly set EI_OSABI based on the os (Konstantin Zhuravlyov, 2017-10-04, 3 files, -7/+24)
    Differential Revision: https://reviews.llvm.org/D38555
    llvm-svn: 314943
* clang-format file. (Adrian Prantl, 2017-10-04, 1 file, -30/+27)
    llvm-svn: 314942
* delete commented out code. (Adrian Prantl, 2017-10-04, 1 file, -2/+0)
    llvm-svn: 314941
* Do not call Loop::getName on possibly dead loops (Sanjoy Das, 2017-10-04, 1 file, -2/+4)
    This fixes PR34832.
    llvm-svn: 314938
* [MachineBlockPlacement] Make sure PreferredLoopExit is cleared every time a new loop is processed (Xin Tong, 2017-10-04, 1 file, -0/+10)
    Summary: Rotate on exit that actually exits the current loop.
    Reviewers: davidxl, danielcdh, iteratee, chandlerc
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D38563
    llvm-svn: 314937
* Fix a -Wparentheses warning. NFC. (Hans Wennborg, 2017-10-04, 1 file, -1/+1)
    llvm-svn: 314936
* [LoopDeletion] Move deleteDeadLoop to LoopUtils. NFC (Marcello Maggioni, 2017-10-04, 2 files, -134/+124)
    llvm-svn: 314934
* Bring r314809 back. (Rafael Espindola, 2017-10-04, 3 files, -4/+19)
    But now include a check for CPU_COUNT so we still build on 10-year-old versions of glibc.
    Original message:
    Use sched_getaffinity instead of std::thread::hardware_concurrency.
    The issue with std::thread::hardware_concurrency is that it forwards to libc and some implementations (like glibc) don't take thread affinity into consideration.
    With this change an LLVM program that can execute on only 2 cores will use 2 threads, even if the machine has 32 cores.
    This makes benchmarking a lot easier, but should also help if someone doesn't want to use all cores for compilation, for example.
    llvm-svn: 314931
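As a rough illustration of the approach r314931 describes, the sketch below counts the CPUs in the calling process's affinity mask and falls back to std::thread::hardware_concurrency when that is unavailable. The function name `usableThreadCount` and the preprocessor guard are assumptions for this note; the actual LLVM change uses a CMake check for CPU_COUNT rather than this guard.

```cpp
#if defined(__linux__)
#include <sched.h> // sched_getaffinity, cpu_set_t, CPU_ZERO, CPU_COUNT
#endif
#include <thread>

// Hypothetical helper: number of CPUs this process is allowed to run on.
unsigned usableThreadCount() {
#if defined(__linux__) && defined(CPU_COUNT)
  cpu_set_t Set;
  CPU_ZERO(&Set);
  // pid 0 means "the calling thread"; on success Set holds its affinity mask.
  if (sched_getaffinity(0, sizeof(Set), &Set) == 0) {
    int N = CPU_COUNT(&Set);
    if (N > 0)
      return static_cast<unsigned>(N);
  }
#endif
  // Fallback: may ignore affinity, and may return 0 if the count is unknown.
  unsigned N = std::thread::hardware_concurrency();
  return N != 0 ? N : 1;
}
```

Run under something like `taskset -c 0,1`, a helper of this shape would be expected to report 2 even on a machine with many more cores, which is the behaviour the commit message is after.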
* [SimplifyCFG] put the optional assumption cache pointer in the options struct; NFCI (Sanjay Patel, 2017-10-04, 4 files, -54/+47)
    This is a follow-up to https://reviews.llvm.org/D38138.
    I fixed the capitalization of some functions because we're changing those lines anyway and that helped verify that we weren't accidentally dropping any options by using default param values.
    llvm-svn: 314930
* Recommit r314561 after fixing msan build failure (Xinliang David Li, 2017-10-04, 2 files, -0/+234)
    (trial 2) Incoming val defined by terminator instruction which also requires bitcasts cannot be handled.
    llvm-svn: 314928
* Recommit: Use the basic cost if a GEP is not used as addressing mode (Jun Bum Lim, 2017-10-04, 3 files, -2/+7)
    Recommitting r314517 with the fix for handling ConstantExpr.
    Original commit message:
    Currently, getGEPCost() returns TCC_FREE whenever a GEP is a legal addressing mode in the target. However, since it doesn't check its actual users, it will return FREE even in cases where the GEP cannot be folded away as part of an actual addressing mode. For example, if a user of the GEP is a call instruction taking the GEP as a parameter, then the GEP may not be folded in isel.
    llvm-svn: 314923
* Revert D38481 due to missing cmake check for CPU_COUNT (Daniel Neilson, 2017-10-04, 3 files, -19/+4)
    Summary: This reverts D38481. The change breaks systems with older versions of glibc. It injects a use of CPU_COUNT() from sched.h without checking to ensure that the function exists first.
    Reviewers:
    Subscribers:
    llvm-svn: 314922
* [X86][AVX] Improve (i8 bitcast (v8i1 x)) handling for v8i64/v8f64 512-bit vector compare results. (Simon Pilgrim, 2017-10-04, 1 file, -6/+5)
    AVX1/AVX2 targets were missing a chance to use vmovmskps for v8f32/v8i32 results for bool vector bitcasts
    llvm-svn: 314921
* [Hexagon] Add a member Subtarget to HexagonInstrInfo, NFC (Krzysztof Parzyszek, 2017-10-04, 2 files, -51/+25)
    llvm-svn: 314920
* Revert r314886 "[X86] Improvement in CodeGen instruction selection for LEAs (re-applying post required revision changes.)" (Hans Wennborg, 2017-10-04, 3 files, -506/+10)
    It broke the Chromium / SQLite build; see PR34830.
    > Summary:
    > 1/ Operand folding during complex pattern matching for LEAs has been
    > extended, such that it promotes Scale to accommodate similar operand
    > appearing in the DAG.
    > e.g.
    > T1 = A + B
    > T2 = T1 + 10
    > T3 = T2 + A
    > For above DAG rooted at T3, X86AddressMode will now look like
    > Base = B , Index = A , Scale = 2 , Disp = 10
    >
    > 2/ During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
    > so that if there is an opportunity then complex LEAs (having 3 operands)
    > could be factored out.
    > e.g.
    > leal 1(%rax,%rcx,1), %rdx
    > leal 1(%rax,%rcx,2), %rcx
    > will be factored as following
    > leal 1(%rax,%rcx,1), %rdx
    > leal (%rdx,%rcx) , %edx
    >
    > 3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
    > thus avoiding creation of any complex LEAs within a loop.
    >
    > Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy
    >
    > Reviewed By: lsaba
    >
    > Subscribers: jmolloy, spatel, igorb, llvm-commits
    >
    > Differential Revision: https://reviews.llvm.org/D35014
    llvm-svn: 314919
* [X86][SSE] Add support for lowering v8i16 binary shuffles to PACKSS/PACKUS (Simon Pilgrim, 2017-10-04, 1 file, -0/+5)
    Missed in D38472
    llvm-svn: 314916
* [X86] Redefine MOVSS/MOVSD instructions to take VR128 regclass as input instead of FR32/FR64 (Craig Topper, 2017-10-04, 3 files, -77/+73)
    This patch redefines the MOVSS/MOVSD instructions to take VR128 as their second input. This allows the MOVSS/SD->BLEND commute to work without requiring a COPY to be inserted.
    This should fix PR33079.
    Overall this looks to be an improvement in the generated code. I haven't checked the EXPENSIVE_CHECKS build but I'll do that and update with results.
    Differential Revision: https://reviews.llvm.org/D38449
    llvm-svn: 314914
* bpf: fix an insn encoding issue for neg insn (Yonghong Song, 2017-10-04, 1 file, -2/+0)
    Signed-off-by: Yonghong Song <yhs@fb.com>
    llvm-svn: 314911
* [OptRemark] Move YAML writing to IR (Adam Nemet, 2017-10-04, 4 files, -92/+88)
    Before the patch this was in Analysis. Moving it to IR and making it an implicit part of LLVMContext::diagnose allows the full opt-remark facility to be used outside passes, e.g. in the pass manager. Jessica is planning to use this to report function size after each pass. The same could be used for time reports.
    Tested with BUILD_SHARED_LIBS=On.
    llvm-svn: 314909
* Also update MachineORE after r314874. (Adam Nemet, 2017-10-04, 1 file, -4/+2)
    llvm-svn: 314908
* [NFC] clang-format lib/Transforms/Scalar/MergeICmps.cpp (Clement Courbet, 2017-10-04, 1 file, -39/+21)
    llvm-svn: 314906
* [X86][SSE] Early out from ComputeNumSignBitsForTargetNode. NFCI. (Simon Pilgrim, 2017-10-04, 1 file, -2/+6)
    Early out from vector shift by immediates that will exceed eltsize - don't bother making an unnecessary ComputeNumSignBits recursive call.
    llvm-svn: 314903
* [X86][SSE] Add support for lowering unary shuffles to PACKSS/PACKUS (Simon Pilgrim, 2017-10-04, 1 file, -12/+24)
    Extension to D38472
    llvm-svn: 314901
* [AVR] Implement LPMWRdZ pseudo-instruction's expansion. (Dylan McKay, 2017-10-04, 1 file, -1/+44)
    FIXME: implementation is mostly copy-pasted from LDWRdPtr, so we should refactor a bit and unify the two
    Patch by Gergo Erdi.
    llvm-svn: 314898
* [AVR] Factor out mayLoad in tablegen patterns (Dylan McKay, 2017-10-04, 1 file, -2/+2)
    Patch by Gergo Erdi.
    llvm-svn: 314897
* [AVR] Elaborate LDWRdPtr into `ld r, X++; ld r+1, X` (Dylan McKay, 2017-10-04, 2 files, -7/+7)
    Patch by Gergo Erdi.
    llvm-svn: 314896
* [AVR] Insert JMP for long branches (Dylan McKay, 2017-10-04, 2 files, -2/+22)
    Previously, on long branches (relative jumps of >4 kB), an assertion failure was hit, as AVRInstrInfo::insertIndirectBranch was not implemented. Despite its name, it is called by the branch relaxator for *all* unconditional jumps.
    Patch by Thomas Backman.
    llvm-svn: 314891
* [AVR] Fix displacement overflow for LDDW/STDW (Dylan McKay, 2017-10-04, 2 files, -5/+13)
    In some cases, the code generator attempts to generate instructions such as:
      lddw r24, Y+63
    which expands to:
      ldd r24, Y+63
      ldd r25, Y+64 # Oops! This is actually ld r25, Y in the binary
    This commit limits the first offset to 62, and thus the second to 63.
    It also updates some asserts in AVRExpandPseudoInsts.cpp, including for INW and OUTW, which appear to be unused.
    Patch by Thomas Backman.
    llvm-svn: 314890
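The limit described in r314890 follows from the 6-bit displacement field of LDD/STD, which holds 0..63. The sketch below checks that both bytes of a word access stay encodable. The names `encodeDisp` and `isValidWordDisp` are hypothetical, and modelling the overflow as a simple truncation to 6 bits is an assumption based on the "Y+64 is actually Y" remark in the message, not a statement about the real encoder.

```cpp
// Largest displacement a single LDD/STD can encode (6-bit field).
constexpr unsigned MaxByteDisp = 63;

// Assumed overflow model: only the low 6 bits of the displacement survive.
constexpr unsigned encodeDisp(unsigned Disp) { return Disp & 0x3Fu; }

// A word access at Disp expands to byte accesses at Disp and Disp + 1,
// so the second displacement must also fit in the field.
constexpr bool isValidWordDisp(unsigned Disp) {
  return Disp + 1 <= MaxByteDisp;
}

static_assert(isValidWordDisp(62), "62 and 63 are both encodable");
static_assert(!isValidWordDisp(63), "the second byte would need 64");
static_assert(encodeDisp(64) == 0, "64 wraps around to displacement 0");
```

Under this model, `lddw r24, Y+63` is rejected because its second half would silently read from `Y+0`.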
* [ARM] Add diag string for movw/movt immediates in assembly (Oliver Stannard, 2017-10-04, 1 file, -0/+1)
    This adds diagnostics for invalid immediate operands to the MOVW and MOVT instructions (ARM and Thumb).
    Differential revision: https://reviews.llvm.org/D31879
    llvm-svn: 314888
* [ARM, Asm] Change grammar of immediate operand diagnostics (Oliver Stannard, 2017-10-04, 1 file, -3/+3)
    Currently, our diagnostics for assembly operands are not consistent. Some start with (for example) "immediate operand must be ...", and some with "operand must be an immediate ...". I think the latter form is preferable for a few reasons:
      * It's unambiguous that it is referring to the expected type of operand, not the type the user provided. For example, the user could provide a register operand, and get a message talking about an operand as if it is already an immediate, just not in the accepted range.
      * It allows us to have a consistent style once we add diagnostics for operands that could take two forms, for example a label or pc-relative memory operand.
    Differential revision: https://reviews.llvm.org/D36689
    llvm-svn: 314887
* [X86] Improvement in CodeGen instruction selection for LEAs (re-applying post required revision changes.) (Jatin Bhateja, 2017-10-04, 3 files, -10/+506)
    Summary:
    1/ Operand folding during complex pattern matching for LEAs has been extended, such that it promotes Scale to accommodate similar operand appearing in the DAG.
    e.g.
      T1 = A + B
      T2 = T1 + 10
      T3 = T2 + A
    For above DAG rooted at T3, X86AddressMode will now look like
      Base = B , Index = A , Scale = 2 , Disp = 10
    2/ During OptimizeLEAPass down the pipeline factorization is now performed over LEAs so that if there is an opportunity then complex LEAs (having 3 operands) could be factored out.
    e.g.
      leal 1(%rax,%rcx,1), %rdx
      leal 1(%rax,%rcx,2), %rcx
    will be factored as following
      leal 1(%rax,%rcx,1), %rdx
      leal (%rdx,%rcx) , %edx
    3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops, thus avoiding creation of any complex LEAs within a loop.
    Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy
    Reviewed By: lsaba
    Subscribers: jmolloy, spatel, igorb, llvm-commits
    Differential Revision: https://reviews.llvm.org/D35014
    llvm-svn: 314886
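The first item in r314886 can be checked arithmetically: the chain T1, T2, T3 computes the same value as an address with Base = B, Index = A, Scale = 2, Disp = 10. The sketch below uses a hypothetical `AddrMode` struct that only mirrors the four field names from the message; it is not LLVM's X86AddressMode.

```cpp
#include <cstdint>

// Hypothetical stand-in for the fields named in the commit message.
struct AddrMode {
  std::int64_t Base;
  std::int64_t Index;
  std::int64_t Scale;
  std::int64_t Disp;
};

// Value an LEA with this addressing mode would produce.
constexpr std::int64_t evaluate(const AddrMode &AM) {
  return AM.Base + AM.Index * AM.Scale + AM.Disp;
}

// The unfolded DAG from the message: T1 = A + B, T2 = T1 + 10, T3 = T2 + A.
constexpr std::int64_t unfoldedChain(std::int64_t A, std::int64_t B) {
  return ((A + B) + 10) + A;
}

static_assert(unfoldedChain(3, 5) == evaluate(AddrMode{5, 3, 2, 10}),
              "A appears twice, so it folds into Index with Scale = 2");
```

Because A is added twice, promoting Scale from 1 to 2 lets a single LEA cover all three additions.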
* [MC] - Don't assert when non-English characters are used. (George Rimar, 2017-10-04, 1 file, -12/+13)
    I found that llvm-mc does not like non-English characters even in comments, which it tries to tokenize.
    The problem happens because functions like isdigit() and isalnum() take an int argument and expect it to be non-negative. At the same time, MCParser uses char* to store the input buffer pointer, and char is signed, so it is possible to pass a negative value to one of the functions above, which triggers an assert.
    A testcase for demonstration is provided.
    To fix the issue, helper functions were introduced in StringExtras.h.
    Differential revision: https://reviews.llvm.org/D38461
    llvm-svn: 314883
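The problem described in r314883 is that the `<cctype>` classifiers have undefined behaviour for negative arguments other than EOF, and a plain `char` holding a byte such as 0xC3 is negative on platforms where char is signed. The sketch below shows two common ways around this. The function names are hypothetical and are not claimed to match the helpers added to StringExtras.h.

```cpp
#include <cctype>

// Range checks on the char itself: no call into <cctype>, so negative
// char values are simply classified as "not a digit / not a letter".
inline bool isDigitSafe(char C) { return C >= '0' && C <= '9'; }
inline bool isAlphaSafe(char C) {
  return (C >= 'a' && C <= 'z') || (C >= 'A' && C <= 'Z');
}
inline bool isAlnumSafe(char C) { return isDigitSafe(C) || isAlphaSafe(C); }

// Alternative: keep using <cctype>, but convert to unsigned char first so
// the argument is always in the range the C library accepts.
inline bool isDigitViaCctype(char C) {
  return std::isdigit(static_cast<unsigned char>(C)) != 0;
}
```

The range-check versions are also locale-independent, which is usually what a tokenizer for an ASCII-based syntax wants.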
* Recommit [UnreachableBlockElim] Use COPY if PHI input is undef (Mikael Holmen, 2017-10-04, 1 file, -2/+3)
    This time invoking llc with "-march=x86-64" in the testcase, so we don't assume the default target is x86.
    Summary:
    If we have
      %vreg0<def> = PHI %vreg2<undef>, <BB#0>, %vreg3, <BB#2>; GR32:%vreg0,%vreg2,%vreg3
      %vreg3<def,tied1> = ADD32ri8 %vreg0<kill,tied0>, 1, %EFLAGS<imp-def>; GR32:%vreg3,%vreg0
    then we can't just change %vreg0 into %vreg3, since %vreg2 is actually undef. We would have to also copy the undef flag to be able to change the register.
    Instead we deal with this case like other cases where we can't just replace the register: we insert a COPY. The code creating the COPY already copied all flags from the PHI input, so the undef flag will be transferred as it should.
    Reviewers: kparzysz
    Reviewed By: kparzysz
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D38235
    llvm-svn: 314882
* [IRCE] Temporarily disable unsigned latch conditions by default (Max Kazantsev, 2017-10-04, 1 file, -0/+21)
    We have found some corner cases connected to range intersection where IRCE does the wrong thing when the latch condition is unsigned. The fix for that will go in as a follow-up. This patch temporarily disables IRCE for unsigned latch conditions until the issue is fixed.
    The unsigned latch conditions were introduced to IRCE by rL310027.
    Differential Revision: https://reviews.llvm.org/D38529
    llvm-svn: 314881
* Revert r314879 "[UnreachableBlockElim] Use COPY if PHI input is undef" (Mikael Holmen, 2017-10-04, 1 file, -3/+2)
    Build-bots broke on the new testcase. I'll investigate and fix.
    llvm-svn: 314880
* [UnreachableBlockElim] Use COPY if PHI input is undef (Mikael Holmen, 2017-10-04, 1 file, -2/+3)
    Summary:
    If we have
      %vreg0<def> = PHI %vreg2<undef>, <BB#0>, %vreg3, <BB#2>; GR32:%vreg0,%vreg2,%vreg3
      %vreg3<def,tied1> = ADD32ri8 %vreg0<kill,tied0>, 1, %EFLAGS<imp-def>; GR32:%vreg3,%vreg0
    then we can't just change %vreg0 into %vreg3, since %vreg2 is actually undef. We would have to also copy the undef flag to be able to change the register.
    Instead we deal with this case like other cases where we can't just replace the register: we insert a COPY. The code creating the COPY already copied all flags from the PHI input, so the undef flag will be transferred as it should.
    Reviewers: kparzysz
    Reviewed By: kparzysz
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D38235
    llvm-svn: 314879
* [X86] Fix using the SJLJ jump table on x86_64 (Martin Storsjo, 2017-10-04, 1 file, -10/+65)
    The previous version didn't work if the jump table base address didn't fit in 32 bits, since it was encoded as an immediate offset. And in case the jump table is encoded as 32-bit label differences, we need to load and add them to the table base first.
    This solves the first half of the issues mentioned in PR34720.
    Also fix some of the errors pointed out by -verify-machineinstrs, by using GR32_NOSPRegClass.
    Differential Revision: https://reviews.llvm.org/D38333
    llvm-svn: 314876
* Move verbosity check for remarks to the diag handler (Adam Nemet, 2017-10-04, 2 files, -5/+6)
    Test needs some slight adjustment because we no longer check the existence of BFI but rather that the actual hotness is set on the remark. If entry_count is not set, getBlockProfileCount returns None.
    llvm-svn: 314874
* [FuzzerUtil] Partially revert D38481 on FuzzerUtil (Tim Shen, 2017-10-04, 1 file, -1/+9)
    This is because lib/Fuzzer doesn't really depend on llvm infrastructure. It's not easy to access the llvm hardware_concurrency here.
    Differential Revision: https://reviews.llvm.org/D38481
    llvm-svn: 314870
* Simplify multikey_qsort function. (Rui Ueyama, 2017-10-03, 1 file, -24/+20)
    This function implements the three-way radix quicksort algorithm. This patch simplifies the implementation by using MutableArrayRef.
    llvm-svn: 314858
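For readers unfamiliar with the algorithm named in r314858, here is a minimal sketch of three-way radix quicksort (multikey quicksort) over a `std::vector<std::string>`. It is not the LLVM implementation: the names `charAt`, `multikeySortImpl` and `multikeySort` are made up for this note, and index ranges stand in for the MutableArrayRef the commit mentions.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Character at Pos as a non-negative int, or -1 past the end so that
// shorter strings sort before their extensions.
static int charAt(const std::string &S, std::size_t Pos) {
  return Pos < S.size() ? static_cast<unsigned char>(S[Pos]) : -1;
}

// Sorts V[Lo, Hi), assuming all strings there agree on their first Pos chars.
static void multikeySortImpl(std::vector<std::string> &V, std::size_t Lo,
                             std::size_t Hi, std::size_t Pos) {
  if (Hi - Lo < 2)
    return;
  const int Pivot = charAt(V[Lo + (Hi - Lo) / 2], Pos);
  // Three-way partition on the character at Pos:
  // [Lo, Lt) < Pivot, [Lt, I) == Pivot, [Gt, Hi) > Pivot.
  std::size_t Lt = Lo, Gt = Hi, I = Lo;
  while (I < Gt) {
    const int C = charAt(V[I], Pos);
    if (C < Pivot) {
      std::swap(V[Lt], V[I]);
      ++Lt;
      ++I;
    } else if (C > Pivot) {
      --Gt;
      std::swap(V[I], V[Gt]);
    } else {
      ++I;
    }
  }
  multikeySortImpl(V, Lo, Lt, Pos);
  if (Pivot != -1) // strings in the middle part still differ after Pos
    multikeySortImpl(V, Lt, Gt, Pos + 1);
  multikeySortImpl(V, Gt, Hi, Pos);
}

void multikeySort(std::vector<std::string> &V) {
  multikeySortImpl(V, 0, V.size(), 0);
}
```

Only the middle partition advances to the next character, so shared prefixes are compared once per level instead of once per comparison.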
* [AArch64] Use LateSimplifyCFG after expanding atomic operations. (Balaram Makam, 2017-10-03, 1 file, -1/+1)
    Summary: After r308422 we defer optimizations that can destroy loop canonical forms to LateSimplifyCFG. Running LateSimplifyCFG after expanding atomic operations can exploit more control-flow opportunities.
    Reviewers: mcrosier, t.p.northover, efriedma
    Reviewed By: efriedma
    Subscribers: aemerson, rengolin, javed.absar, llvm-commits, kristof.beyls
    Differential Revision: https://reviews.llvm.org/D38262
    llvm-svn: 314857
* AMDGPU: Expand setcc for v2f32 and v4f32 (Konstantin Zhuravlyov, 2017-10-03, 1 file, -0/+1)
    llvm-svn: 314853
* AMDGPU: Expand setcc for v2i32 and v4i32 (Konstantin Zhuravlyov, 2017-10-03, 1 file, -0/+1)
    llvm-svn: 314852
* AMDGPU: Add ELFOSABI_AMDGPU_MESA3D (Konstantin Zhuravlyov, 2017-10-03, 1 file, -0/+1)
    Differential Revision: https://reviews.llvm.org/D38387
    llvm-svn: 314846