summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [X86][BtVer2] Update the WriteLoad latency.Andrea Di Biagio2019-01-211-3/+2
| | | | | | | | | | | | | | | | | | r327630 introduced new write definitions for float/vector loads. Before that revision, WriteLoad was used by both integer/float (scalar/vector) load. So, WriteLoad had to conservatively declare a latency to 5cy. That is because the load-to-use latency for float/vector load is 5cy. Now that we have dedicated writes for float/vector loads, there is no reason why we should keep the latency of WriteLoad to 5cy. At the moment, WriteLoad is only used by scalar integer loads only; we can assume an optimstic 3cy latency for them. This patch changes that latency from 5cy to 3cy, and regenerates the affected scheduling/mca tests. Differential Revision: https://reviews.llvm.org/D56922 llvm-svn: 351742
* [X86] Remove and autoupgrade vpmovqd/vpmovwb intrinsics using trunc+select.Craig Topper2019-01-212-8/+12
| | | | llvm-svn: 351729
* [SCEV][NFC] Introduces expression sizes estimationMax Kazantsev2019-01-211-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces the field `ExpressionSize` in SCEV. This field is calculated only once on SCEV creation, and it represents the complexity of this SCEV from arithmetical point of view (not from the point of the number of actual different SCEV nodes that are used in the expression). Roughly saying, it is the number of operands and operations symbols when we print this SCEV. A formal definition is following: if SCEV `X` has operands `Op1`, `Op2`, ..., `OpN`, then Size(X) = 1 + Size(Op1) + Size(Op2) + ... + Size(OpN). Size of SCEVConstant and SCEVUnknown is one. Expression size may be used as a universal way to limit SCEV transformations for huge SCEVs. Currently, we have a bunch of options that represents various limits (such as recursion depth limit) that may not make any sense from the point of view of a LLVM users who is not familiar with SCEV internals, and all these different options pursue one goal. A more general rule that may potentially allow us to get rid of this redundancy in options is "do not make transformations with SCEVs of huge size". It can apply to all SCEV traversals and transformations that may need to visit a SCEV node more than once, hence they are prone to combinatorial explosions. This patch only introduces SCEV sizes calculation as NFC, its utilization will be introduced in follow-up patches. Differential Revision: https://reviews.llvm.org/D35989 Reviewed By: reames llvm-svn: 351725
* [RISCV] Add R_RISCV_RELAX relocation to all possible relax candidates.Kito Cheng2019-01-211-7/+15
| | | | | | | | | | | | Summary: Add R_RISCV_RELAX relocation to all possible relax candidates and update corresponding testcase. Reviewers: asb, apazos Differential Revision: https://reviews.llvm.org/D46677 llvm-svn: 351723
* [AVR] Insert unconditional branch when inserting MBBs between blocks with ↵Dylan McKay2019-01-211-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | fallthrough This updates the AVR Select8/Select16 expansion code so that, when inserting the two basic blocks for true and false conditions, any existing fallthrough on the previous block is preserved. Prior to this patch, if the block before the Select pseudo fell through to the subsequent block, two new basic blocks would be inserted at the prior fallthrough point, changing the fallthrough destination. The predecessor or successor lists were not updated, causing the BranchFolding pass at -O1 and above the rearrange basic blocks, causing an infinite loop. Not to mention the unconditional fallthrough to the true block is incorrect in of itself. This patch modifies the Select8/16 expansion so that, if inserting true and false basic blocks at a fallthrough point, the implicit branch is preserved by means of an explicit, unconditional branch to the previous fallthrough destination. Thanks to Carl Peto for reporting this bug. This fixes avr-rust bug https://github.com/avr-rust/rust/issues/123. llvm-svn: 351721
* [AVR] Enable emission of debug informationDylan McKay2019-01-211-0/+1
| | | | | | | | | | | | | | Prior to this, the code was missing AVR-specific relocation logic in RelocVisitor.h. This patch teaches RelocVisitor about R_AVR_16 and R_AVR_32. Debug information is emitted in the final object file, and understood by 'avr-readelf --debug-dump' from AVR-GCC. llvm-dwarfdump is yet to understand how to dump AVR DWARF symbols. llvm-svn: 351720
* Revert "[AVR] Insert unconditional branch when inserting MBBs between blocks ↵Dylan McKay2019-01-211-9/+0
| | | | | | | | | | | | with fallthrough" This reverts commit r351718. Carl pointed out that the unit test could be improved. This patch will be recommitted once the test is made more resilient. llvm-svn: 351719
* [AVR] Insert unconditional branch when inserting MBBs between blocks with ↵Dylan McKay2019-01-211-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | fallthrough This updates the AVR Select8/Select16 expansion code so that, when inserting the two basic blocks for true and false conditions, any existing fallthrough on the previous block is preserved. Prior to this patch, if the block before the Select pseudo fell through to the subsequent block, two new basic blocks would be inserted at the prior fallthrough point, changing the fallthrough destination. The predecessor or successor lists were not updated, causing the BranchFolding pass at -O1 and above the rearrange basic blocks, causing an infinite loop. Not to mention the unconditional fallthrough to the true block is incorrect in of itself. This patch modifies the Select8/16 expansion so that, if inserting true and false basic blocks at a fallthrough point, the implicit branch is preserved by means of an explicit, unconditional branch to the previous fallthrough destination. Thanks to Carl Peto for reporting this bug. This fixes avr-rust bug https://github.com/avr-rust/rust/issues/123. llvm-svn: 351718
* Tentative fix for r351701 and gcc 6.2 build on ubuntuSerge Guelton2019-01-201-2/+3
| | | | llvm-svn: 351705
* Replace llvm::isPodLike<...> by llvm::is_trivially_copyable<...>Serge Guelton2019-01-201-7/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As noted in https://bugs.llvm.org/show_bug.cgi?id=36651, the specialization for isPodLike<std::pair<...>> did not match the expectation of std::is_trivially_copyable which makes the memcpy optimization invalid. This patch renames the llvm::isPodLike trait into llvm::is_trivially_copyable. Unfortunately std::is_trivially_copyable is not portable across compiler / STL versions. So a portable version is provided too. Note that the following specialization were invalid: std::pair<T0, T1> llvm::Optional<T> Tests have been added to assert that former specialization are respected by the standard usage of llvm::is_trivially_copyable, and that when a decent version of std::is_trivially_copyable is available, llvm::is_trivially_copyable is compared to std::is_trivially_copyable. As of this patch, llvm::Optional is no longer considered trivially copyable, even if T is. This is to be fixed in a later patch, as it has impact on a long-running bug (see r347004) Note that GCC warns about this UB, but this got silented by https://reviews.llvm.org/D50296. Differential Revision: https://reviews.llvm.org/D54472 llvm-svn: 351701
* AMDGPU: Legalize more bitcastsMatt Arsenault2019-01-201-5/+7
| | | | llvm-svn: 351700
* GlobalISel: Add isPointer legality predicatesMatt Arsenault2019-01-201-0/+14
| | | | llvm-svn: 351699
* AMDGPU/GlobalISel: Really legalize exts from i1Matt Arsenault2019-01-201-1/+2
| | | | | | | | There is a combine that was hiding these tests not actually testing what they should be, although they were producing the expected end result. llvm-svn: 351698
* [X86] Auto upgrade VPCOM/VPCOMU intrinsics to generic integer comparisonsSimon Pilgrim2019-01-203-84/+25
| | | | | | | | This causes a couple of changes in the upgrade tests as signed/unsigned eq/ne are equivalent and we constant fold true/false codes, these changes are the same as what we already do for avx512 cmp/ucmp. Noticed while cleaning up vector integer comparison costs for PR40376. llvm-svn: 351697
* GlobalISel: Implement widenScalar for basic FP opsMatt Arsenault2019-01-202-10/+21
| | | | llvm-svn: 351696
* AMDGPU/GlobalISel: Legalize f32->f16 fptruncMatt Arsenault2019-01-201-1/+1
| | | | llvm-svn: 351695
* AMDGPU/GlobalISel: Fix some crashs in g_unmerge_values/g_merge_valuesMatt Arsenault2019-01-201-12/+73
| | | | | | | | | | | This was crashing in the predicate function assuming the value is a vector. Copy more of what AArch64 uses. This probably needs more refinement later, but I don't exactly understand what it means in some cases, particularly since any legalization for these seems to be missing. llvm-svn: 351693
* AMDGPU/GlobalISel: Regbank select for fpextMatt Arsenault2019-01-201-0/+1
| | | | llvm-svn: 351692
* AMDGPU/GlobalISel: Cleanup legality for extensionsMatt Arsenault2019-01-201-10/+6
| | | | llvm-svn: 351691
* [X86] Auto upgrade old style VPCOM/VPCOMU intrinsics to generic integer ↵Simon Pilgrim2019-01-201-22/+47
| | | | | | | | | | | | comparisons We were upgrading these to the new style VPCOM/VPCOMU intrinsics (which includes the condition code immediate), but we'll be getting rid of those shortly, so convert these to generics first. This causes a couple of changes in the upgrade tests as signed/unsigned eq/ne are equivalent and we constant fold true/false codes, these changes are the same as what we already do for avx512 cmp/ucmp. Noticed while cleaning up vector integer comparison costs for PR40376. llvm-svn: 351690
* [CostModel][X86] Add explicit vector select costsSimon Pilgrim2019-01-201-0/+41
| | | | | | | | | | Prior to SSE41 (and sometimes on AVX1), vector select has to be performed as a ((X & C)|(Y & ~C)) bit select. Exposes a couple of issues with the min/max reduction costs (which only go down to SSE42 for some reason). The increase pre-SSE41 selection costs also prevent a couple of tests from firing any longer, so I've either tweaked the target or added AVX tests as well to the existing SSE2 tests. llvm-svn: 351685
* [CostModel][X86] Add explicit fcmp costs for pre-SSE42 targetsSimon Pilgrim2019-01-201-0/+11
| | | | | | Typical throughputs: cmpss/cmpps = 1cy and cmpsd/cmppd = 2cy before the Core2 era llvm-svn: 351684
* [TTI][X86] Reordered getCmpSelInstrCost cost tables in descending ISA order. ↵Simon Pilgrim2019-01-201-24/+24
| | | | | | | | NFCI. Minor tidyup to make it clearer whats going on before adding additional costs. llvm-svn: 351683
* [AVR] Replace two references to ARM's 't2_so_imm' type commentsDylan McKay2019-01-201-2/+2
| | | | | | | | | | | These were originally introduced in a copy-paste committed in r351526. The reference to 't2_so_imm' have been updated to 'imm_com8' so the comment is now accurate. Thanks to Eli Friedman for noticing this. llvm-svn: 351674
* [AVR] Fix codegen bug in 16-bit loadsDylan McKay2019-01-202-8/+5
| | | | | | | | | | | | | | | | | | | | | | | | Prior to this patch, the AVR::LDWRdPtr instruction was always lowered to instructions of this pattern: ld $GPR8, [PTR:XYZ]+ ld $GPR8, [PTR]+1 This has a problem; the [PTR] is incremented in-place once, but never decremented. Future uses of the same pointer will use the now clobbered value, leading to the pointer being incorrect by an offset of one. This patch modifies the expansion code of the LDWRdPtr pseudo instruction so that the pointer variable is not silently clobbered in future uses in the same live range. Bug first reported by Keshav Kini. Patch by Kaushik Phatak. llvm-svn: 351673
* Revert "[AVR] Fix codegen bug in 16-bit loads"Dylan McKay2019-01-201-5/+5
| | | | | | | | | | | This reverts commit r351544. In that commit, I had mistakenly misattributed the issue submitter as the patch author, Kaushik Phatak. The patch will be recommitted immediately with the correct attribution. llvm-svn: 351672
* [ConstantMerge] Factor out check for un-mergeable globals, NFCVedant Kumar2019-01-201-10/+12
| | | | llvm-svn: 351671
* [X86] Add masked MCVTSI2P/MCVTUI2P ISD opcodes to model the cvtqq2ps ↵Craig Topper2019-01-195-17/+88
| | | | | | | | cvtuqq2ps nodes that produce less than 128-bits of results. These nodes zero the upper half of the result and can't be represented with vselect. llvm-svn: 351666
* Update more file headers across all of the LLVM projects in the monorepoChandler Carruth2019-01-196-24/+18
| | | | | | | | | | | | | | | | | | to reflect the new license. These used slightly different spellings that defeated my regular expressions. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351648
* [InstCombine] Simplify cttz/ctlz + icmp ugt/ultNikita Popov2019-01-192-11/+70
| | | | | | | | | | | | Followup to D55745, this time handling comparisons with ugt and ult predicates (which are the canonical forms for non-equality predicates). For ctlz we can convert into a simple icmp, for cttz we can convert into a mask check. Differential Revision: https://reviews.llvm.org/D56355 llvm-svn: 351645
* [NFC] Fix unused variable warnings in Release buildsJohannes Doerfert2019-01-191-0/+2
| | | | llvm-svn: 351641
* Update the file headers across all of the LLVM projects in the monorepoChandler Carruth2019-01-192961-11844/+8883
| | | | | | | | | | | | | | | | | to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
* Convert two more files that were using Windows line endings and removeChandler Carruth2019-01-191-22/+22
| | | | | | | a stray single '\r' from one file. These are the last line ending issues I can find in the files containing parts of LLVM's file headers. llvm-svn: 351634
* Enable IPConstantPropagation to work with abstract call sitesJohannes Doerfert2019-01-191-12/+23
| | | | | | | | | | | | | This modification of the currently unused inter-procedural constant propagation pass (IPConstantPropagation) shows how abstract call sites enable optimization of callback calls alongside direct and indirect calls. Through minimal changes, mostly dealing with the partial mapping of callbacks, inter-procedural constant propagation was enabled for callbacks, e.g., OpenMP runtime calls or pthreads_create. Differential Revision: https://reviews.llvm.org/D56447 llvm-svn: 351628
* AbstractCallSite -- A unified interface for (in)direct and callback callsJohannes Doerfert2019-01-194-0/+181
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | An abstract call site is a wrapper that allows to treat direct, indirect, and callback calls the same. If an abstract call site represents a direct or indirect call site it behaves like a stripped down version of a normal call site object. The abstract call site can also represent a callback call, thus the fact that the initially called function (=broker) may invoke a third one (=callback callee). In this case, the abstract call side hides the middle man, hence the broker function. The result is a representation of the callback call, inside the broker, but in the context of the original instruction that invoked the broker. Again, there are up to three functions involved when we talk about callback call sites. The caller (1), which invokes the broker function. The broker function (2), that may or may not invoke the callback callee. And finally the callback callee (3), which is the target of the callback call. The abstract call site will handle the mapping from parameters to arguments depending on the semantic of the broker function. However, it is important to note that the mapping is often partial. Thus, some arguments of the call/invoke instruction are mapped to parameters of the callee while others are not. At the same time, arguments of the callback callee might be unknown, thus "null" if queried. This patch introduces also !callback metadata which describe how a callback broker maps from parameters to arguments. This metadata is directly created by clang for known broker functions, provided through source code attributes by the user, or later deduced by analyses. For motivation and additional information please see the corresponding talk (slides/video) https://llvm.org/devmtg/2018-10/talk-abstracts.html#talk20 as well as the LCPC paper http://compilers.cs.uni-saarland.de/people/doerfert/par_opt_lcpc18.pdf Differential Revision: https://reviews.llvm.org/D54498 llvm-svn: 351627
* Reapply "[CGP] Check for existing inttotpr before creating new one"Roman Tereshin2019-01-191-4/+18
| | | | | | Original commit: r351582 llvm-svn: 351626
* [MergeFunc] Allow merging identical vararg functions using aliasesVedant Kumar2019-01-191-11/+14
| | | | | | | | | | | Thanks to Nikita Popov for pointing out this missed case. This is a follow-up to r351411, which disabled function merging for vararg functions outright due to a miscompile (see llvm.org/PR40345). Differential Revision: https://reviews.llvm.org/D56865 llvm-svn: 351624
* [HotColdSplit] Mark inherently cold functions as suchVedant Kumar2019-01-191-20/+39
| | | | | | | | | | If an inherently cold function is found, mark it as cold. For now this means applying the `cold` and `minsize` attributes. As a drive-by, revisit and clean up the criteria for considering a function for splitting. Add tests. llvm-svn: 351623
* [HotColdSplit] Remove a set which tracked split functions (NFC)Vedant Kumar2019-01-191-8/+3
| | | | | | | Use the begin/end iterator idiom to avoid visiting split functions, instead of doing a set lookup. llvm-svn: 351622
* [CodeExtractor] Emit lifetime markers around reloads of outputsVedant Kumar2019-01-191-66/+71
| | | | | | | | | | | | | | | | CodeExtractor permits extracting a region of blocks from a function even when values defined within the region are used outside of it. This is typically done by creating an alloca in the original function and reloading the alloca after a call to the extracted function. Wrap the reload in lifetime start/end markers to promote stack coloring. Suggested by Sergei Kachkov! Differential Revision: https://reviews.llvm.org/D56045 llvm-svn: 351621
* Revert "Reapply "[CGP] Check for existing inttotpr before creating new one""Roman Tereshin2019-01-191-17/+4
| | | | | | | | | | This reverts commit r351618. Compiler RT + ASAN tests are failing for PowerPC. Not sure how would I reproduce these on macOS, so reverting (again) until I do. llvm-svn: 351619
* Reapply "[CGP] Check for existing inttotpr before creating new one"Roman Tereshin2019-01-191-4/+17
| | | | | | Original commit: r351582 llvm-svn: 351618
* Revert r351584: "GlobalISel: Verify g_zextload and g_sextload"Amara Emerson2019-01-191-14/+1
| | | | | | This new assertion triggered on the AArch64 GlobalISel bots. Reverting while it's being investigated. llvm-svn: 351617
* [X86] Deduplicate static calling convention helpers for code size, NFCReid Kleckner2019-01-196-60/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Right now we include ${TGT}GenCallingConv.inc once per each instruction selection method implemented by ${TGT}: - ${TGT}ISelLowering.cpp - ${TGT}CallLowering.cpp - ${TGT}FastISel.cpp Instead, add a mechanism to tablegen for marking a particular convention as "External", which causes tablegen to emit into the ::llvm namespace, instead of as a static helper. This allows us to provide a header to forward declare it, so we can simply call the function from all the places it is referenced. Typically the calling convention analyzer is called indirectly, so it doesn't benefit from inlining. This saves a bit of final binary size, but mostly just saves object file size: before after diff artifact 12852K 12492K -360K X86ISelLowering.cpp.obj 4640K 4280K -360K X86FastISel.cpp.obj 1704K 2092K +388K X86CallingConv.cpp.obj 52448K 52336K -112K llc.exe I didn't collect before numbers for X86CallLowering.cpp.obj, which is for GlobalISel, but we should save 360K there as well. This patch applies the strategy to the X86 backend, but there is no reason it couldn't be applied to the other backends that implement multiple ISel strategies, like AArch64. Reviewers: craig.topper, hfinkel, efriedma Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D56883 llvm-svn: 351616
* Remove F_modify flag from FileOutputBuffer.Rui Ueyama2019-01-191-26/+11
| | | | | | | | This code is dead. There is no use of the feature in the entire LLVM codebase. Differential Revision: https://reviews.llvm.org/D56939 llvm-svn: 351613
* AMDGPU/GlobalISel: Legalize more types for selectMatt Arsenault2019-01-181-2/+4
| | | | llvm-svn: 351599
* Revert "[CGP] Check for existing inttotpr before creating new one"Roman Tereshin2019-01-181-13/+4
| | | | | | | | This reverts commit r351582. Bots are failing. Reverting this to fix and re-commit later. llvm-svn: 351598
* AMDGPU/GlobalISel: Legalize illegal g_constantMatt Arsenault2019-01-181-4/+9
| | | | llvm-svn: 351596
* GlobalISel: Verify G_BITCASTMatt Arsenault2019-01-181-0/+13
| | | | llvm-svn: 351594
* GlobalISel: Verify G_ICMP/G_FCMP vector typesMatt Arsenault2019-01-181-0/+11
| | | | llvm-svn: 351591
OpenPOWER on IntegriCloud