path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* [WebAssembly] Handle undefined lane indices in SIMD patterns | Thomas Lively | 2018-10-19 | 2 | -2/+40
  Summary: Undefined indices in shuffles can be used when not all lanes of the output vector will be used. This happens for example in the expansion of vector reduce operations. Regardless, undefs are legal as lane indices in IR and should be supported.
  Reviewers: aheejin, dschuff
  Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
  Differential Revision: https://reviews.llvm.org/D53057
  llvm-svn: 344803
* Fix a use-after-RAUW bug in large GEP splitting | Krzysztof Pszeniczny | 2018-10-19 | 1 | -3/+14
  Summary: Large GEP splitting, introduced in rL332015, uses a `DenseMap<AssertingVH<Value>, ...>`. This causes an assertion to fail (in debug builds) or undefined behaviour to occur (in release builds) when a value is RAUWed.
  This manifested itself in the 7zip benchmark from the llvm test suite built on ARM with `-fstrict-vtable-pointers` enabled while RAUWing invariant group launders and splits in CodeGenPrepare.
  This patch merges the large offsets of the argument and the result of an invariant.group strip/launder intrinsic before RAUWing.
  Reviewers: Prazek, javed.absar, haicheng, efriedma
  Reviewed By: Prazek, efriedma
  Subscribers: kristof.beyls, hiraditya, llvm-commits
  Differential Revision: https://reviews.llvm.org/D51936
  llvm-svn: 344802
* [InstCombine] InstCombine and InstSimplify for minimum and maximum | Thomas Lively | 2018-10-19 | 3 | -12/+46
  Summary: Depends on D52765
  Reviewers: aheejin, dschuff
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D52766
  llvm-svn: 344799
* [ConstantFolding] Constant fold minimum and maximum intrinsics | Thomas Lively | 2018-10-19 | 1 | -0/+14
  Summary: Depends on D52764
  Reviewers: aheejin, dschuff
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D52765
  llvm-svn: 344796
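The commit message above does not spell out what folding these intrinsics involves. As a rough sketch, assuming the semantics described in the LLVM language reference (a NaN operand yields NaN, and -0.0 orders below +0.0), folding two constant operands could look like the following. The function names `foldMinimum` and `foldMaximum` are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: fold minimum(a, b) for two known constants.
double foldMinimum(double a, double b) {
  // Assumed semantics: any NaN operand makes the result NaN.
  if (std::isnan(a) || std::isnan(b))
    return std::numeric_limits<double>::quiet_NaN();
  // Both operands are zero: prefer the negative zero, since -0.0 < +0.0 here.
  if (a == 0.0 && b == 0.0)
    return std::signbit(a) ? a : b;
  return a < b ? a : b;
}

// Hypothetical helper: fold maximum(a, b) for two known constants.
double foldMaximum(double a, double b) {
  if (std::isnan(a) || std::isnan(b))
    return std::numeric_limits<double>::quiet_NaN();
  // Both operands are zero: prefer the positive zero.
  if (a == 0.0 && b == 0.0)
    return std::signbit(a) ? b : a;
  return a > b ? a : b;
}
```

The zero case is handled separately because `a < b` is false for `-0.0` and `+0.0`, so a plain comparison would not distinguish them.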
* [dwarfdump] Hide ranges in diff-mode. | Jonas Devlieghere | 2018-10-19 | 1 | -1/+3
  llvm-dwarfdump --diff should not print DW_AT_ranges. This patch fixes that.
  Differential revision: https://reviews.llvm.org/D53353
  llvm-svn: 344794
* [InstCombine] use m_Neg() in dyn_castNegVal() to match vectors with undef elts | Sanjay Patel | 2018-10-19 | 1 | -2/+3
  llvm-svn: 344793
* [Hexagon] Remove support for V4 | Krzysztof Parzyszek | 2018-10-19 | 23 | -757/+567
  llvm-svn: 344791
* [MC][DWARF][AsmParser] Ensure nested CFI frames are diagnosed. | Kristina Brooks | 2018-10-19 | 2 | -5/+10
  This avoids a crash (with asserts) or bad codegen (without asserts) in Dwarf streamer later on. This patch fixes this condition in MCStreamer and propagates SMLoc down when it's available, with an added bonus of source locations for those specific types of errors. Further patches could use similar improvements, as currently most non-Windows CFI directives lack an SMLoc parameter.
  Modified an existing test to verify source location propagation, added an object-file version of it to verify that it does not crash, and added a standalone test that only ensures it does not crash.
  Differential Revision: https://reviews.llvm.org/D51695
  llvm-svn: 344781
* Use llvm::{all,any,none}_of instead std::{all,any,none}_of. NFC | Fangrui Song | 2018-10-19 | 5 | -34/+24
  llvm-svn: 344774
* [TI removal] Remove `TerminatorInst` from the IR type system! | Chandler Carruth | 2018-10-19 | 1 | -80/+73
  llvm-svn: 344769
* [TI removal] Switch some newly added code over to use `Instruction` | Chandler Carruth | 2018-10-19 | 2 | -8/+7
  directly.
  llvm-svn: 344768
* [TI removal] Update the C API for the move away from `TerminatorInst`. | Chandler Carruth | 2018-10-18 | 1 | -3/+8
  This updates the C API for the removal of `TerminatorInst`. It converts the type query to a predicate query and moves the generic methods to work on `Instruction` instances that satisfy this predicate rather than requiring a specific type. It also clarifies that the C API wrapping `BasicBlock::getTerminator` just returns an `Instruction`. Because this was always wrapped opaquely as a value and the functions consuming these values will work on `Instruction` objects, this shouldn't break any clients.
  This is a completely compatible change to the C API.
  Differential Revision: https://reviews.llvm.org/D52968
  llvm-svn: 344764
* Make Function::getInstructionCount const | Mircea Trofin | 2018-10-18 | 1 | -2/+2
  Summary: Function::getInstructionCount can be const.
  Reviewers: davidxl, paquette
  Reviewed By: davidxl
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D53378
  llvm-svn: 344754
* Revert r344693 ("[ARM] bottom-top mul support in ARMParallelDSP") | Eli Friedman | 2018-10-18 | 1 | -194/+27
  Still causing failures on the polly-aosp buildbot; I'll follow up with a reduced testcase.
  llvm-svn: 344752
* [Pipeliner] copyToPhi DAG Mutation to improve scheduling. | Sumanth Gundapaneni | 2018-10-18 | 1 | -1/+95
  In a loop, create artificial dependences between the source of a COPY/REG_SEQUENCE and its use in the next iteration.
  Eg:
    SRC ----Data Dep--> COPY
    COPY ---Anti Dep--> PHI (implies, to be used in next iteration)
    PHI ----Data Dep--> USE
  This patch creates
    USE ----Artificial Dep---> SRC
  This will effectively schedule the COPY late to eliminate additional copies.
  Before this patch, the schedule can be SRC, COPY, USE: the COPY is used in the next iteration and it needs to be preserved.
  After this patch, the schedule can be USE, SRC, COPY: the COPY is used in the next iteration and the live interval is reduced.
  Differential Revision: https://reviews.llvm.org/D53303
  llvm-svn: 344748
* [LV] Fold tail by masking to vectorize loops of arbitrary trip count under opt for size | Ayal Zaks | 2018-10-18 | 4 | -38/+188
  When optimizing for size, a loop is vectorized only if the resulting vector loop completely replaces the original scalar loop. This holds if no runtime guards are needed, if the original trip-count TC does not overflow, and if TC is a known constant that is a multiple of the VF. The last two TC-related conditions can be overcome by
  1. rounding the trip-count of the vector loop up from TC to a multiple of VF;
  2. masking the vector body under a newly introduced "if (i <= TC-1)" condition.
  The patch allows loops with arbitrary trip counts to be vectorized under -Os, subject to the existing cost model considerations. It also applies to loops with small trip counts (under -O2) which are currently handled as if under -Os.
  The patch does not handle loops with reductions, live-outs, or w/o a primary induction variable, and disallows interleave groups.
  (Third, final and main part of -)
  Differential Revision: https://reviews.llvm.org/D50480
  llvm-svn: 344743
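The two steps described above (round the trip count up to a multiple of VF, then guard each lane with a mask) can be illustrated with a scalar emulation. This is only a sketch of the idea, not the vectorizer's implementation: the function name `addOneTailFolded`, the fixed `VF = 4`, and the loop body `out[i] = in[i] + 1` are all assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed vectorization factor for this illustration.
constexpr std::size_t VF = 4;

// Scalar emulation of a tail-folded "vector" loop computing out[i] = in[i] + 1.
std::vector<int> addOneTailFolded(const std::vector<int>& in) {
  const std::size_t TC = in.size();  // original trip count
  std::vector<int> out(TC);
  // Step 1: round the trip count up to a multiple of VF.
  const std::size_t roundedTC = (TC + VF - 1) / VF * VF;
  for (std::size_t base = 0; base < roundedTC; base += VF) {
    // Each inner iteration stands in for one lane of a vector operation.
    for (std::size_t lane = 0; lane < VF; ++lane) {
      const std::size_t i = base + lane;
      // Step 2: the lane mask. Written as i < TC, which matches
      // "i <= TC-1" but avoids unsigned wrap-around when TC == 0.
      const bool active = i < TC;
      if (active)
        out[i] = in[i] + 1;
    }
  }
  return out;
}
```

With six elements and `VF = 4`, the loop runs two vector iterations and the last two lanes of the second iteration are masked off, so no scalar remainder loop is needed.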
* [DA] DivergenceAnalysis for unstructured, reducible CFGs | Nicolai Haehnle | 2018-10-18 | 3 | -0/+807
  Summary: This is patch 2 of the new DivergenceAnalysis (https://reviews.llvm.org/D50433).
  This patch contains a generic divergence analysis implementation for unstructured, reducible Control-Flow Graphs. It contains two new classes. The `SyncDependenceAnalysis` class lazily computes sync dependences, which relate divergent branches to points of joining divergent control. The `DivergenceAnalysis` class contains the generic divergence analysis implementation.
  Reviewers: nhaehnle
  Reviewed By: nhaehnle
  Subscribers: sameerds, kristina, nhaehnle, xbolva00, tschuett, mgorny, llvm-commits
  Differential Revision: https://reviews.llvm.org/D51491
  llvm-svn: 344734
* Add a emitUnaryFloatFnCall version that fetches the function name from TLI | Mikael Holmen | 2018-10-18 | 3 | -11/+65
  Summary: In several places in the code we use the following pattern:

    if (hasUnaryFloatFn(&TLI, Ty, LibFunc_tan, LibFunc_tanf, LibFunc_tanl)) {
      [...]
      Value *Res = emitUnaryFloatFnCall(X, TLI.getName(LibFunc_tan), B, Attrs);
      [...]
    }

  In short, we check if there is a lib-function for a certain type, and then we _always_ fetch the name of the "double" version of the lib function and construct a call to the appropriate function, that we just checked exists, using that "double" name as a basis.
  This is of course a problem in cases where the target doesn't support the "double" version, but e.g. only the "float" version. In that case TLI.getName(LibFunc_tan) returns "", and emitUnaryFloatFnCall happily appends an "f" to "", and we erroneously end up with a call to a function called "f".
  To solve this, the above pattern is changed to

    if (hasUnaryFloatFn(&TLI, Ty, LibFunc_tan, LibFunc_tanf, LibFunc_tanl)) {
      [...]
      Value *Res = emitUnaryFloatFnCall(X, &TLI, LibFunc_tan, LibFunc_tanf, LibFunc_tanl, B, Attrs);
      [...]
    }

  I.e. instead of first fetching the name of the "double" version and then letting emitUnaryFloatFnCall() add the final "f" or "l", we let emitUnaryFloatFnCall() fetch the right name from TLI.
  Reviewers: eli.friedman, efriedma
  Reviewed By: efriedma
  Subscribers: efriedma, bjope, llvm-commits
  Differential Revision: https://reviews.llvm.org/D53370
  llvm-svn: 344725
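The failure mode described above can be reproduced with a small stand-alone model. Everything here is a toy: `ToyTLI`, the enums, and the two helper functions are hypothetical stand-ins invented for this sketch and do not use the real `TargetLibraryInfo` API. The only behavior borrowed from the commit message is that looking up an unavailable function yields an empty name and that the old pattern appended a suffix to the "double" name.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy stand-ins for LibFunc_tan, LibFunc_tanf and LibFunc_tanl.
enum class LibFunc { tan, tanf, tanl };
enum class FloatTy { Float, Double, LongDouble };

// Toy library info: getName returns "" for a function the target lacks.
struct ToyTLI {
  std::map<LibFunc, std::string> available;
  std::string getName(LibFunc f) const {
    auto it = available.find(f);
    return it == available.end() ? std::string() : it->second;
  }
};

// Old pattern: always start from the "double" name and append a suffix.
std::string nameFromDoubleVariant(const ToyTLI& tli, FloatTy ty) {
  std::string name = tli.getName(LibFunc::tan);
  if (ty == FloatTy::Float)
    name += 'f';
  else if (ty == FloatTy::LongDouble)
    name += 'l';
  return name;
}

// New pattern: pick the variant for the type, then ask for that name directly.
std::string nameFromTLI(const ToyTLI& tli, FloatTy ty) {
  switch (ty) {
    case FloatTy::Float:      return tli.getName(LibFunc::tanf);
    case FloatTy::LongDouble: return tli.getName(LibFunc::tanl);
    case FloatTy::Double:     return tli.getName(LibFunc::tan);
  }
  return std::string();
}
```

On a toy target that only provides `tanf`, the old helper produces the bogus name `"f"` while the new one produces `"tanf"`.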
* [X86] Support for the mno-tls-direct-seg-refs flag | Kristina Brooks | 2018-10-18 | 1 | -0/+6
  Allows disabling direct TLS segment access (%fs or %gs). GCC supports a similar flag; it can be useful in some circumstances, e.g. when a thread context block needs to be updated directly from user space. More info and specific use cases: https://bugs.llvm.org/show_bug.cgi?id=16145
  There is another revision for clang as well. Related: D53102
  All X86 CodeGen tests appear to pass:
  ```
  [46/47] Running lit suite /SourceCache/llvm-trunk-8.0/test/CodeGen
  Testing Time: 23.17s
  Expected Passes : 3801
  Expected Failures : 15
  Unsupported Tests : 8021
  ```
  Reviewed by: Craig Topper.
  Patch by nruslan (Ruslan Nikolaev).
  Differential Revision: https://reviews.llvm.org/D53103
  llvm-svn: 344723
* [TI removal] Switch simple loop unswitch to `Instruction`. | Chandler Carruth | 2018-10-18 | 1 | -5/+5
  llvm-svn: 344719
* [TI removal] Switch NewGVN to directly use `Instruction`. | Chandler Carruth | 2018-10-18 | 1 | -3/+3
  llvm-svn: 344718
* [TI removal] Use `Instruction` instead of `TerminatorInst` for | Chandler Carruth | 2018-10-18 | 1 | -2/+2
  a variable's type.
  llvm-svn: 344717
* [TI removal] Update CodeExtractor to use Instruction directly. | Chandler Carruth | 2018-10-18 | 1 | -4/+4
  llvm-svn: 344716
* [TI removal] Switch ObjCARC code to directly use the nice range-based | Chandler Carruth | 2018-10-18 | 2 | -16/+9
  successors API or directly build the iterators out of the terminator instruction and avoid requiring a TerminatorInst variable.
  llvm-svn: 344715
* [TI removal] Switch MergeFunctions to directly use Instruction API. | Chandler Carruth | 2018-10-18 | 1 | -1/+1
  llvm-svn: 344714
* [TI removal] Switch an analysis to just use Instruction. | Chandler Carruth | 2018-10-18 | 1 | -5/+5
  llvm-svn: 344713
* Port libcxxabi r344607 into llvm | Pavel Labath | 2018-10-17 | 2 | -2/+3
  Summary: The original commit message was:
  This uses CRTP (for performance reasons) to allow a user to override demangler functions to implement custom parsing logic. The motivation for this is LLDB, which needs to occasionally modify the mangled names. One such instance is already implemented via the TypeCallback member, but this is very specific functionality which does not help with any other use case. Currently we have a use case for modifying the constructor flavours, which would require adding another callback. This approach does not scale.
  With CRTP, the user (LLDB) can override any function it needs without any special support from the demangler library. After LLDB is ported to use this instead of the TypeCallback mechanism, the callback can be removed.
  The only difference here is the addition of a unit test which exercises the CRTP mechanism to override a function in the parser.
  Reviewers: erik.pilkington, rsmith, EricWF
  Subscribers: mgorny, kristina, llvm-commits
  Differential Revision: https://reviews.llvm.org/D53300
  llvm-svn: 344703
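For readers unfamiliar with the pattern, here is a minimal CRTP sketch showing how a derived class can replace one step of a parser without virtual dispatch. The names `ToyParser`, `parseName`, `DefaultParser` and `UpperCaseParser` are hypothetical and unrelated to the real demangler classes; the sketch only illustrates the mechanism the message refers to.

```cpp
#include <cassert>
#include <string>

// Base parser template. Calls go through self(), so a Derived class that
// declares its own parseName() shadows the default one at compile time.
template <typename Derived>
struct ToyParser {
  Derived& self() { return static_cast<Derived&>(*this); }

  // Default behavior: return the name unchanged.
  std::string parseName(const std::string& s) { return s; }

  // Driver that delegates the customizable step to the derived class.
  std::string parse(const std::string& s) {
    return "<" + self().parseName(s) + ">";
  }
};

// Uses every default.
struct DefaultParser : ToyParser<DefaultParser> {};

// Overrides one step, in the way the message describes a client doing.
struct UpperCaseParser : ToyParser<UpperCaseParser> {
  std::string parseName(const std::string& s) {
    std::string out = s;
    for (char& c : out)
      if (c >= 'a' && c <= 'z')
        c = static_cast<char>(c - 'a' + 'A');
    return out;
  }
};
```

Because the dispatch is resolved statically, the base class needs no callback member for each customization point, which is the scaling concern raised in the message.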
* AMDGPU: Avoid selecting ds_{read,write}2_b32 on SI | Nicolai Haehnle | 2018-10-17 | 3 | -3/+26
  Summary: To work around a hardware issue in the (base + offset) calculation when base is negative.
  The impact on code quality should be limited since SILoadStoreOptimizer still runs afterwards and is able to combine loads/stores based on known sign information.
  This fixes visible corruption in Hitman on SI (easily reproducible by running benchmark mode).
  Change-Id: Ia178d207a5e2ac38ae7cd98b532ea2ae74704e5f
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99923
  Reviewers: arsenm, mareko
  Subscribers: jholewinski, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
  Differential Revision: https://reviews.llvm.org/D53160
  llvm-svn: 344698
* StructurizeCFG: Simplify inserted PHI nodes | Nicolai Haehnle | 2018-10-17 | 1 | -1/+23
  Summary: This improves subsequent divergence analysis in some cases.
  Change-Id: I5e95e7ec7fd3fa80d414d1a53a02fea23e3d67d3
  Reviewers: arsenm, rampitec
  Subscribers: jvesely, wdng, llvm-commits
  Differential Revision: https://reviews.llvm.org/D53316
  llvm-svn: 344697
* AMDGPU: Divergence-driven selection of scalar buffer load intrinsics | Nicolai Haehnle | 2018-10-17 | 6 | -220/+90
  Summary: Moving SMRD to VMEM in SIFixSGPRCopies is rather bad for performance if the load is really uniform. So select the scalar load intrinsics directly to either VMEM or SMRD buffer loads based on divergence analysis.
  If an offset happens to end up in a VGPR -- either because a floating point calculation was involved, or due to other remaining deficiencies in SIFixSGPRCopies -- we use v_readfirstlane.
  There is some unrelated churn in tests since we now select MUBUF offsets in a unified way with non-scalar buffer loads.
  Change-Id: I170e6816323beb1348677b358c9d380865cd1a19
  Reviewers: arsenm, alex-t, rampitec, tpr
  Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, llvm-commits
  Differential Revision: https://reviews.llvm.org/D53283
  llvm-svn: 344696
* [ARM] bottom-top mul support in ARMParallelDSP | Sam Parker | 2018-10-17 | 1 | -27/+194
  Previously reverted in rL343082. Original commit message:
  On failing to find sequences that can be converted into dual macs, try to find sequential 16-bit loads that are used by muls which we can then use smultb, smulbt, smultt with a wide load.
  Differential Revision: https://reviews.llvm.org/D51983
  llvm-svn: 344693
* AMDGPU: Remove dead TableGen code | Nicolai Haehnle | 2018-10-17 | 1 | -2/+0
  Summary: Change-Id: Ic1f2c1d0cf9e90a0baa9fc6bacd0d3c386069fb0
  Reviewers: tpr
  Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, llvm-commits
  Differential Revision: https://reviews.llvm.org/D53318
  Change-Id: Ib4d143c898801e5cf6cb9999a495d62c91ae77fb
  llvm-svn: 344691
* [NFC] Remove GOTO from SCEV | Max Kazantsev | 2018-10-17 | 1 | -20/+14
  llvm-svn: 344687
* [NewPM] teach -passes= to emit meaningful error messages | Fedor Sergeev | 2018-10-17 | 2 | -162/+218
  All the PassBuilder::parse interfaces now return a descriptive StringError instead of a plain bool. This makes -passes/aa-pipeline parsing errors context-specific and thus less confusing.
  TODO: ideally we should also make suggestions for misspelled pass names, but that requires some extensions to PassBuilder.
  Reviewed By: philip.pfaffe, chandlerc
  Differential Revision: https://reviews.llvm.org/D53246
  llvm-svn: 344685
* [MIPS GlobalISel] Legalize constants | Petar Jovanovic | 2018-10-17 | 1 | -1/+24
  Legalize s1, s8, s16 and s64 G_CONSTANT for MIPS32.
  Patch by Petar Avramovic.
  Differential Revision: https://reviews.llvm.org/D53077
  llvm-svn: 344684
* [ARM] Do not fuse VADD and VMUL, continued (2/2) | Sjoerd Meijer | 2018-10-17 | 1 | -2/+4
  This is patch 2/2, following up on D53314, and is the functional change to prevent fusing mul + add sequences into VFMAs.
  Differential revision: https://reviews.llvm.org/D53315
  llvm-svn: 344683
* [LoopPredication] add some simple stats | Fedor Sergeev | 2018-10-17 | 1 | -0/+8
  Just adding some useful statistics to the LoopPredication pass, which previously had none.
  llvm-svn: 344681
* [ARM] Follow up of rL344671, attempt to pacify a buildbot | Sjoerd Meijer | 2018-10-17 | 1 | -1/+1
  It was rightfully complaining about an unpretty logical expression.
  llvm-svn: 344677
* [ARM][NFCI] Do not fuse VADD and VMUL, continued (1/2) | Sjoerd Meijer | 2018-10-17 | 3 | -42/+41
  This is a follow up of rL342874, which stopped fusing muls and adds into VMLAs for performance reasons on the Cortex-M4 and Cortex-M33. This is a series of 2 patches that tries to achieve the same for VFMA. The second column in the table below shows what we were generating before rL342874, the third column what changed with rL342874, and the last column what we want to achieve with these 2 patches:

    --------------------------------------------------------
    | Opt    | < rL342874 | >= rL342874 |                  |
    |------------------------------------------------------|
    |-O3     | vmla       | vmul        | vmul             |
    |        |            | vadd        | vadd             |
    |------------------------------------------------------|
    |-Ofast  | vfma       | vfma        | vmul             |
    |        |            |             | vadd             |
    |------------------------------------------------------|
    |-Oz     | vmla       | vmla        | vmla             |
    --------------------------------------------------------

  This patch, 1/2, is a cleanup of the spaghetti predicate logic on the different VMLA and VFMA codegen rules, so that we can make the final functional change in patch 2/2. This also fixes a typo in the regression test added in rL342874.
  Differential revision: https://reviews.llvm.org/D53314
  llvm-svn: 344671
* [Sanitizer][PassManager] Fix for failing ASan tests on arm-linux-gnueabihf | Leonard Chan | 2018-10-17 | 1 | -1/+3
  Forgot to initialize the legacy pass in its constructor.
  Differential Revision: https://reviews.llvm.org/D53350
  llvm-svn: 344659
* [ThinLTO] Add importing stats to thin link | Teresa Johnson | 2018-10-16 | 1 | -5/+27
  Summary: Previously we could only get the number of imported functions and variables from the backend. This adds stats to the thin link where the importing is decided.
  Reviewers: wmi
  Subscribers: inglorion, dexonsmith, llvm-commits
  Differential Revision: https://reviews.llvm.org/D53337
  llvm-svn: 344658
* [SanitizerCoverage] Don't duplicate code to get section pointers | Jonathan Metzman | 2018-10-16 | 1 | -33/+15
  Summary: Merge code used to get section start and section end pointers for SanitizerCoverage constructors. This includes code that handles getting the start pointers when targeting MSVC.
  Reviewers: kcc, morehouse
  Reviewed By: morehouse
  Subscribers: kcc, hiraditya
  Differential Revision: https://reviews.llvm.org/D53211
  llvm-svn: 344657
* [X86] Match (cmp (and (shr X, C), mask), 0) to BEXTR+TEST. | Craig Topper | 2018-10-16 | 1 | -15/+32
  Without this we match the CMP+AND to a TEST and then match the SHR separately. I'm trusting analyzeCompare to remove the TEST during the peephole pass. Otherwise we need to check the flag users to see if they only use the Z flag.
  This recovers a case lost by r344270.
  Differential Revision: https://reviews.llvm.org/D53310
  llvm-svn: 344649
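As a rough illustration of why the pattern above maps onto a bit-field extract, the following portable sketch models the source expression and the extract side by side. It assumes the mask is a contiguous run of ones starting at bit 0, which the commit message does not state explicitly. The helper names `lowMaskWidth`, `extractBits` and `fieldIsZero` are hypothetical, and no actual BEXTR instruction or intrinsic is used.

```cpp
#include <cassert>
#include <cstdint>

// Returns the width of `mask` if it is a run of ones starting at bit 0
// (e.g. 0xF -> 4), and 0 if it is not of that form.
unsigned lowMaskWidth(uint32_t mask) {
  if (mask == 0 || (mask & (mask + 1)) != 0)
    return 0;
  unsigned width = 0;
  while (mask != 0) {
    ++width;
    mask >>= 1;
  }
  return width;
}

// Portable model of a bit-field extract: `len` bits starting at bit `start`.
uint32_t extractBits(uint32_t x, unsigned start, unsigned len) {
  assert(start <= 31 && len >= 1 && len <= 32);
  // Build the field mask in 64 bits so len == 32 does not overflow the shift.
  const uint32_t fieldMask = static_cast<uint32_t>((uint64_t{1} << len) - 1);
  return (x >> start) & fieldMask;
}

// The source pattern: (cmp (and (shr X, C), mask), 0).
bool fieldIsZero(uint32_t x, unsigned c, uint32_t mask) {
  return ((x >> c) & mask) == 0;
}
```

When `lowMaskWidth(mask)` is non-zero, `fieldIsZero(x, c, mask)` gives the same answer as `extractBits(x, c, lowMaskWidth(mask)) == 0`, which is the extract-then-test shape the commit title describes.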
* [InstCombine] Cleanup libfunc attribute inferring | David Bolvansky | 2018-10-16 | 4 | -56/+74
  Reviewers: efriedma
  Reviewed By: efriedma
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D53338
  llvm-svn: 344645
* [ORC] Make the VModuleKey optional, propagate it via MaterializationUnit and | Lang Hames | 2018-10-16 | 10 | -71/+70
  MaterializationResponsibility.
  VModuleKeys are intended to enable selective removal of modules from a JIT session, however for a wide variety of use cases selective removal is not needed and introduces unnecessary overhead. As of this commit, the default constructed VModuleKey value is reserved as a "do not track" value, and becomes the default when adding a new module to the JIT.
  This commit also changes the propagation of VModuleKeys. They were passed alongside the MaterializationResponsibility instance in XXLayer::emit methods, but are now propagated as part of the MaterializationResponsibility instance itself (and as part of MaterializationUnit when stored in a JITDylib). Associating VModuleKeys with MaterializationUnits in this way should allow for a thread-safe module removal mechanism in the future, even when a module is in the process of being compiled, by having the MaterializationResponsibility object check in on its VModuleKey's state before committing its results to the JITDylib.
  llvm-svn: 344643
* Revert "[WebAssembly] LSDA info generation" | Krasimir Georgiev | 2018-10-16 | 16 | -264/+62
  This reverts commit r344575. The newly introduced test eh-lsda.ll.test fails with a use-after-free under an ASAN build.
  llvm-svn: 344639
* [PATCH] [NFC][AArch64] Fix refactoring of macro fusion | Evandro Menezes | 2018-10-16 | 1 | -8/+4
  Fix compiler error.
  llvm-svn: 344632
* [Intrinsic] Signed Saturation Addition Intrinsic | Leonard Chan | 2018-10-16 | 10 | -0/+114
  Add an intrinsic that takes 2 integers and performs saturation addition on them.
  This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics.
  Differential Revision: https://reviews.llvm.org/D53053
  llvm-svn: 344629
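To make "saturation addition" concrete: the result is clamped to the representable range instead of wrapping on overflow. Below is a minimal sketch for signed 32-bit values. The function name `saturatingAdd` and the fixed 32-bit width are assumptions for illustration, and the intrinsic's lowering is not modeled here.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical model of signed saturating addition on 32-bit integers.
int32_t saturatingAdd(int32_t a, int32_t b) {
  // Widen to 64 bits so the exact sum is always representable.
  const int64_t wide = static_cast<int64_t>(a) + static_cast<int64_t>(b);
  // Clamp to the 32-bit range instead of wrapping around.
  if (wide > std::numeric_limits<int32_t>::max())
    return std::numeric_limits<int32_t>::max();
  if (wide < std::numeric_limits<int32_t>::min())
    return std::numeric_limits<int32_t>::min();
  return static_cast<int32_t>(wide);
}
```

Widening before the addition avoids signed overflow in the model itself, which would otherwise be undefined behavior in C++.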
* [NFC][ARM] Refactor macro fusion | Evandro Menezes | 2018-10-16 | 1 | -19/+5
  Simplify code for wildcards.
  llvm-svn: 344625
* [NFC][AArch64] Refactor macro fusion | Evandro Menezes | 2018-10-16 | 1 | -76/+90
  Simplify API of checking functions.
  llvm-svn: 344624