Commit message (Author, Age, Files, Lines)
...
* PR45350: Handle unsized array CXXConstructExprs in constant evaluation (Richard Smith, 2020-05-18, 2 files, -1/+42)
| | | | | | of array new expressions with runtime bound. (cherry picked from commit 9a7eda1bece887ca9af085d79fe6e4fb8826dcda)
* [lldb] [PECOFF] Only use PECallFrameInfo on the one supported architecture (Martin Storsjö, 2020-05-18, 1 file, -0/+3)
| | | | | | | | | The RuntimeFunction struct, which PECallFrameInfo interprets, has a different layout and different semantics on all architectures. Differential Revision: https://reviews.llvm.org/D77000 (cherry picked from commit aa786b881fc89a2a9883bff77912f2053126f95b)
* [CodeGen] fix inline builtin-related breakage from D78162 (George Burgess IV, 2020-05-07, 2 files, -3/+25)
| | | | | | | | | | | In cases where we have multiple decls of an inline builtin, we may need to go hunting for the one with a definition when setting function attributes. An additional test-case was provided on https://github.com/ClangBuiltLinux/linux/issues/979 (cherry picked from commit 94908088a831141cfbdd15fc5837dccf30cfeeb6)
* [CodeGen] only add nobuiltin to inline builtins if we'll emit them (George Burgess IV, 2020-05-07, 2 files, -1/+27)
| | | | | | | | | | | | | | There are some inline builtin definitions that we can't emit (isTriviallyRecursive & callers go into why). Marking these nobuiltin is only useful if we actually emit the body, so don't mark these as such unless we _do_ plan on emitting that. This suboptimality was encountered in Linux (see some discussion on D71082, and https://github.com/ClangBuiltLinux/linux/issues/979). Differential Revision: https://reviews.llvm.org/D78162 (cherry picked from commit 2dd17ff08165e6118e70f00e22b2c36d2d4e0a9a)
* [profile] Don't crash when forking in several threads (Calixte Denizet, 2020-05-07, 5 files, -37/+204)
| | | | | | | | | | | | | | | | | | | Summary: When forking in several threads, the counters were written out using the same global static variables (see GCDAProfiling.c), which leads to crashes. So when there is a fork, the counters are reset in the child process and are dumped at exit using interprocess file locking. When there is an exec, the counters are written out and, in case of failure, they are reset. Reviewers: jfb, vsk, marco-c, serge-sans-paille Reviewed By: marco-c, serge-sans-paille Subscribers: llvm-commits, serge-sans-paille, dmajor, cfe-commits, hiraditya, dexonsmith, #sanitizers, marco-c, sylvestre.ledru Tags: #sanitizers, #clang, #llvm Differential Revision: https://reviews.llvm.org/D78477 (cherry picked from commit bec223a9bc4eb9747993ee9a4c1aa135c32123e6)
* [clang-format] [PR45357] Fix issue found with operator spacing (mydeveloperday, 2020-05-07, 2 files, -1/+58)
| | | | | | | | | | | | | | | | | | | Summary: This is a tentative fix for https://bugs.llvm.org/show_bug.cgi?id=45357 Spaces seem to be introduced between * and * due to changes brought in for {D69573} Reviewers: sylvestre.ledru, mitchell-stellar, sammccall, Abpostelnicu, krasimir, jbcoe Reviewed By: Abpostelnicu Subscribers: tstellar, hans, Abpostelnicu, cfe-commits Tags: #clang, #clang-format Differential Revision: https://reviews.llvm.org/D78879 (cherry picked from commit b01dca50085768f1f1a5ad21a685906d48c38816)
* clang-format: Fix pointer alignment for overloaded operators (PR45107) (Hans Wennborg, 2020-05-07, 2 files, -14/+39)
| | | | | | | | | | | | | | | | | | This fixes a regression from D69573 which broke the following example: $ echo 'operator C<T>*();' | bin/clang-format --style=Chromium operator C<T> *(); (There should be no space before the asterisk.) It seems the problem is in TokenAnnotator::spaceRequiredBetween(), which only looked at the token to the left of the * to see if it was a type or not. That code only handled simple types or identifiers, not templates or qualified types. This patch addresses that. Differential revision: https://reviews.llvm.org/D76850 (cherry picked from commit eb85e90350e93a64279139e7eca9ca40c8fbf5eb)
* [libclang] Remove duplicate dependency on LLVMSupport (Jan Korous, 2020-05-07, 1 file, -1/+0)
| | | | | | Differential Revision: https://reviews.llvm.org/D79451 (cherry picked from commit 02b303321d3f0d3b2c69f68aa25560848dd61f98)
* [MachineSink] Fix for breaking phi edges with instructions with multiple defs (David Green, 2020-05-07, 2 files, -17/+69)
| | | | | | | | | | | | | BreakPHIEdge would be set based on whether the instruction needs to insert a new critical edge to allow sinking into a block where the uses are PHI nodes. But for instructions with multiple defs it would be reset on the second def, allowing the instruction to sink where it should not. Fixes PR44981 Differential Revision: https://reviews.llvm.org/D78087 (cherry picked from commit 44c4ba34d001dcf538d7396007b5611d6f697f86)
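The failure mode described in this message, a flag that is overwritten for each def instead of being accumulated, can be sketched in isolation. This is a simplified model and not the MachineSink code: `Def`, `needsCriticalEdgeSplit`, and both functions are hypothetical names chosen for illustration, and the actual patch may be structured differently.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for one register def of an instruction.
struct Def {
  // True if sinking the users of this def would require splitting a PHI edge.
  bool needsCriticalEdgeSplit;
};

// Problematic shape: the flag is reassigned for every def, so the last def
// decides the result and an earlier "true" is lost.
inline bool breakPHIEdgeOverwritten(const std::vector<Def> &Defs) {
  bool BreakPHIEdge = false;
  for (const Def &D : Defs)
    BreakPHIEdge = D.needsCriticalEdgeSplit;
  return BreakPHIEdge;
}

// Safer shape: once any def requires the split, the flag stays set.
inline bool breakPHIEdgeAccumulated(const std::vector<Def> &Defs) {
  bool BreakPHIEdge = false;
  for (const Def &D : Defs)
    BreakPHIEdge = BreakPHIEdge || D.needsCriticalEdgeSplit;
  return BreakPHIEdge;
}
```

With two defs where only the first needs the split, the overwriting version reports that no edge needs breaking, while the accumulating version keeps the requirement.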
* PR45000: Let Sema::SubstParmVarDecl handle default args of lambdas in initializers (Aaron Puchert, 2020-05-06, 7 files, -21/+31)
| | | | | | | | | | | | | | | | | | | | Summary: We extend the behavior for local functions and methods of local classes to lambdas in variable initializers. The initializer is not a separate scope, but we treat it as such. We also remove the (faulty) instantiation of default arguments in TreeTransform::TransformLambdaExpr, because it doesn't do proper initialization, and if it did, we would do it twice (and thus also emit eventual errors twice). Reviewed By: rsmith Differential Revision: https://reviews.llvm.org/D76038 (cherry picked from commit f43859a099fa3587123717be941fa63ba8d0d4f2)
* BPF: fix a CORE optimization bug (Yonghong Song, 2020-05-06, 2 files, -1/+119)
| For the test case in this patch, shown below:
|   struct t { int a; } __attribute__((preserve_access_index));
|   int foo(void *);
|   int test(struct t *arg) {
|     long param[1];
|     param[0] = (long)&arg->a;
|     return foo(param);
|   }
| The IR right before the BPF SimplifyPatchable phase:
|   %1:gpr = LD_imm64 @"llvm.t:0:0$0:0"
|   %2:gpr = LDD killed %1:gpr, 0
|   %3:gpr = ADD_rr %0:gpr(tied-def 0), killed %2:gpr
|   STD killed %3:gpr, %stack.0.param, 0
| After the SimplifyPatchable phase, incorrect IR is generated:
|   %1:gpr = LD_imm64 @"llvm.t:0:0$0:0"
|   %3:gpr = ADD_rr %0:gpr(tied-def 0), killed %1:gpr
|   CORE_MEM killed %3:gpr, 306, %0:gpr, @"llvm.t:0:0$0:0"
| Note that the CORE_MEM pseudo op is introduced to encode memory operations related to CORE. In the above, we intend to check whether we have a store like
|   *(%3:gpr + 0) = ...
| and if this is the case, we could replace it with
|   *(%0:gpr + @"llvm.t:0:0$0:0") = ...
| Unfortunately, in the above, the IR for the store is
|   *(%stack.0.param + 0) = %3:gpr
| and the transformation should not happen. Note that we won't have a problem if the actual CORE dereference (arg->a) happens.
| This patch fixes the problem by skipping the CORE optimization if the use of the ADD_rr result is not the base address of the store operation.
| Differential Revision: https://reviews.llvm.org/D78466
| (cherry picked from commit 3cb7e7bf959dcd3b8080986c62e10a75c7af43f0)
* [Sema] Allow function attribute patchable_function_entry on aarch64_be (Fangrui Song, 2020-05-06, 2 files, -1/+2)
| | | | | | | | Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D79495 (cherry picked from commit 57a1c1be53aeea521747dd2f4b0097831341bea5)
* [llvm-objcopy] Avoid invalid Sec.Offset after D79229 (Fangrui Song, 2020-05-03, 1 file, -4/+4)
| | | | | | | | | | | | To avoid undefined behavior caught by -fsanitize=undefined on binary-paddr.test void SectionWriter::visit(const Section &Sec) { if (Sec.Type != SHT_NOBITS) // Sec.Contents is empty while Sec.Offset may be out of bound llvm::copy(Sec.Contents, Out.getBufferStart() + Sec.Offset); } (cherry picked from commit 762fb1c40eea6878c2d6a1f0f1fc7915c8747981)
* [llvm-objcopy] -O binary: skip empty sections (Fangrui Song, 2020-05-01, 2 files, -8/+64)
| | | | | | | | | | | | | | | | | | | | | | | | | | After SHF_ALLOC sections are ordered by LMA: * If initial sections are empty, GNU objcopy skips their contents while we emit leading zeros. (binary-paddr.test %t4) * If trailing sections are empty, GNU objcopy skips their contents while we emit trailing zeros. (binary-paddr.test %t5) This patch matches GNU objcopy's behavior. Linkers don't keep p_memsz PT_LOAD segments. Such empty sections would not have a containing PT_LOAD and `Section::ParentSegment` might be null if linkers fail to optimize the file offsets (lld D79254). In particular, without D79254, the arm Linux kernel's multi_v5_defconfig depends on this behavior: in `vmlinux`, an empty .text_itcm is mapped at a very high address (0xfffe0000) but the kernel does not expect `objcopy -O binary` to create a very large `arch/arm/boot/Image` (0xfffe0000-0xc0000000 ~= 1GiB). See https://bugs.llvm.org/show_bug.cgi?id=45632 Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D79229 (cherry picked from commit ec786906f5feb4dceba1b5338927079e63e78095)
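The size effect described in this message can be illustrated with a small model. This is only a sketch under stated assumptions: `Sec` and `imageSize` are hypothetical names, the image is modeled as the span from the lowest address to the highest end address, and none of this is llvm-objcopy's actual code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of an allocated section: load address and size in bytes.
struct Sec {
  std::uint64_t Addr;
  std::uint64_t Size;
};

// Size of a flat binary image covering the given sections. When SkipEmpty is
// true, zero-sized sections do not extend the covered address range.
inline std::uint64_t imageSize(const std::vector<Sec> &Secs, bool SkipEmpty) {
  bool Seen = false;
  std::uint64_t Lo = 0, Hi = 0;
  for (const Sec &S : Secs) {
    if (SkipEmpty && S.Size == 0)
      continue;
    const std::uint64_t End = S.Addr + S.Size;
    if (!Seen) {
      Lo = S.Addr;
      Hi = End;
      Seen = true;
      continue;
    }
    if (S.Addr < Lo)
      Lo = S.Addr;
    if (End > Hi)
      Hi = End;
  }
  return Seen ? Hi - Lo : 0;
}
```

Using the addresses quoted in the message, a non-empty section at 0xc0000000 plus an empty one at 0xfffe0000 spans 0x3ffe0000 bytes (about 1 GiB) if the empty section is counted, and only the size of the non-empty section if it is skipped.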
* github actions: Improve abi-compare check (Tom Stellard, 2020-04-30, 1 file, -2/+11)
| | | | | * Install universal-ctags so abi-dumper works correctly. * Compile with -Og.
* libclc: cmake configure should depend on file list (Jan Vesely, 2020-04-30, 1 file, -0/+10)
| | | | | | | | This makes sure targets are rebuilt if a file is added or removed. Reviewer: tstellar Differential Revision: https://reviews.llvm.org/D74662 (cherry picked from commit 814fb658ca262f5c2df47f11d47f91fac188e0d6)
* Add GitHub action for running libclc tests (Tom Stellard, 2020-04-29, 1 file, -0/+53)
|
* libclc: Pass system libraries to the linker after llvm libraries (Tom Stellard, 2020-04-29, 1 file, -1/+1)
| | | | | | | | | | | | | | | | | | Summary: The llvm libraries depend on the symbols in the system libraries, so the system libraries need to be added after. Reviewers: jvesely Reviewed By: jvesely Subscribers: mgorny, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78535 (cherry picked from commit 174c41defc63db4ac7594e00a5044672ff624a31)
* [Coroutines] Fix PR45130 (Jun Ma, 2020-04-29, 2 files, -1/+56)
| | | | | | | | | | For now, when final suspend can be simplified by simplifySuspendPoint, handleFinalSuspend is executed as well to remove last case in switch instruction. This patch fixes it. Differential Revision: https://reviews.llvm.org/D76345 (cherry picked from commit 032251e34d17c1cbf21e7571514bb775ed5cdf30)
* [Clang] Fix Hurd toolchain test on a two-stage build with ThinLTO (Alexandre Ganea, 2020-04-29, 6 files, -14/+35)
| | | | | | | | | | A two-stage ThinLTO build previously failed the clang/test/Driver/hurd.c test because of a static_cast in "tools::gnutools::Linker::ConstructJob()" which wrongly converted an instance of "clang::driver::toolchains::Hurd" into that of "clang::driver::toolchains::Linux". ThinLTO would later devirtualize the "ToolChain.getDynamicLinker(Args)" call and use "Linux::getDynamicLinker()" instead, causing the test to generate a wrong "-dynamic-linker" linker flag (/lib/ld-linux.so.2 instead of /lib/ld.so) Fixes PR45061. Differential Revision: https://reviews.llvm.org/D75373 (cherry picked from commit 7e77cf473ac9d8f8b65db017d660892f1c8f4b75)
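The hazard described in this message is that a `static_cast` to the wrong sibling class lets the optimizer assume a dynamic type the object does not have. A minimal sketch of the safe shape, dispatching through the common base with no downcast, is shown below. The class hierarchy is a deliberately simplified model: the real clang toolchain classes have different bases and signatures, and `dynamicLinkerFor` is a hypothetical helper.

```cpp
#include <cassert>
#include <string>

// Simplified stand-ins for the driver's toolchain classes (not clang's API).
struct ToolChain {
  virtual ~ToolChain() = default;
  virtual std::string getDynamicLinker() const = 0;
};

struct Linux : ToolChain {
  std::string getDynamicLinker() const override { return "/lib/ld-linux.so.2"; }
};

struct Hurd : ToolChain {
  std::string getDynamicLinker() const override { return "/lib/ld.so"; }
};

// Calls through the base reference, so the object's real dynamic type picks
// the override. Casting a Hurd object to Linux with static_cast instead would
// be undefined behavior and could be devirtualized to Linux's implementation.
inline std::string dynamicLinkerFor(const ToolChain &TC) {
  return TC.getDynamicLinker();
}
```

The two path strings are the ones quoted in the commit message; everything else is illustrative.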
* Add GitHub action for running lldb tests (Tom Stellard, 2020-04-29, 1 file, -0/+47)
|
* Revert "Re-land [MC] Fix quadratic behavior in addPendingLabel" (Tom Stellard, 2020-04-29, 2 files, -4/+7)
| | | | | | | This reverts commit aa97472d211df67e91e8c1dd3188a0fb2ff942c8. This commit broke ABI compatibility: https://github.com/llvm/llvm-project/runs/624609989
* Re-land [MC] Fix quadratic behavior in addPendingLabel (Alexandre Ganea, 2020-04-27, 2 files, -7/+4)
| | | | | | | | This was discovered when compiling large unity/blob/jumbo files. Differential Revision: https://reviews.llvm.org/D78775 (cherry picked from commit fd773e8a51b82775f411061117173a21b500642a)
* [PowerPC] Don't generate ST_VSR_SCAL_INT if power8-vector is disabled (Kai Luo, 2020-04-22, 2 files, -3/+12)
| | | | | | | | | | | | | Summary: In https://bugs.llvm.org/show_bug.cgi?id=45297, instruction selection fails for `PPCISD::ST_VSR_SCAL_INT`. The reason `PPCISD::ST_VSR_SCAL_INT` is generated with `-power8-vector` in the IR is that PPC's combiner checks `hasP8Altivec` rather than `hasP8Vector`. This patch should resolve PR45297. Differential Revision: https://reviews.llvm.org/D76773 (cherry picked from commit 8eb40e41f6ec99985a292e342ec303a0bd6f5f41)
* [PowerPC] Fix test for PR45297 to adapt build without asserts. NFC. (Kai Luo, 2020-04-22, 1 file, -1/+1)
| | | | (cherry picked from commit 26b46b67d806a5299a93b1b3bca1548cb47487ff)
* [PowerPC] Enhance test for PR45297. NFC. (Kai Luo, 2020-04-22, 1 file, -4/+7)
| | | | (cherry picked from commit 351b19231554d4dba29c42c798176f1ff3286a32)
* [PowerPC] Pre-commit reduced test case for PR45297. NFC. (Kai Luo, 2020-04-22, 1 file, -0/+10)
| | | | (cherry picked from commit 70f9f4dd9d19ed2cec0d9adf60fede9401898b85)
* [PowerPC] Update alignment for ReuseLoadInfo in LowerFP_TO_INTForReuse (Kai Luo, 2020-04-16, 2 files, -2/+49)
| | | | | | | | | | | | | In LowerFP_TO_INTForReuse, when emitting `stfiwx`, alignment of 4 is set for the `MachineMemOperand`, but RLI(ReuseLoadInfo)'s alignment is not updated for following loads. It's related to failed alignment check reported in https://bugs.llvm.org/show_bug.cgi?id=45297 Differential Revision: https://reviews.llvm.org/D77624 Backport b7d5229d789b7cb2747226d528ed016624b11cea.
* [CodeView] Align type records on 4-bytes when emitting PDBs (Alexandre Ganea, 2020-04-16, 5 files, -6/+83)
| | | | | | | | | | | | | | | When emitting PDBs, the TypeStreamMerger class is used to merge .debug$T records from the input .OBJ files into the output .PDB stream. Records in .OBJs are not required to be aligned on 4-bytes, and "The Netwide Assembler 2.14" generates non-aligned records. When compiling with -DLLVM_ENABLE_ASSERTIONS=ON, an assert was triggered in MergingTypeTableBuilder when non-ghash merging was used. With ghash merging there was no assert. As a result, LLD could potentially generate a non-aligned TPI stream. We now align records on 4-bytes when record indices are remapped, in TypeStreamMerger::remapIndices(). Differential Revision: https://reviews.llvm.org/D75081 (cherry picked from commit a7325298e1f311b383b8ce5ba8e2d3698fef472a)
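Rounding a record length up to a 4-byte boundary is a standard bit trick, sketched below. This is a generic illustration rather than the code in TypeStreamMerger::remapIndices(): `alignTo4` and `paddingTo4` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>

// Rounds Size up to the next multiple of 4 (values already aligned are kept).
inline std::size_t alignTo4(std::size_t Size) {
  return (Size + 3) & ~static_cast<std::size_t>(3);
}

// Number of padding bytes needed after a record of Size bytes.
inline std::size_t paddingTo4(std::size_t Size) {
  return alignTo4(Size) - Size;
}
```

A 6-byte record, for example, would need 2 bytes of padding so that the next record starts on a 4-byte boundary.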
* add release notes for ffp-model and ffp-exception-behavior (Melanie Blower, 2020-04-16, 1 file, -0/+7)
| | | | (cherry picked from commit c8dadac228b7dd3a71d5fc25489d1b884a2b0f5e)
* [X86][SSE] combineX86ShufflesConstants - early out for zeroable vectors (PR45443) (Simon Pilgrim, 2020-04-16, 2 files, -1/+28)
| | | | | | | | | | | | Shuffle combining can insert zero byte sized elements into the shuffle mask, which combineX86ShufflesConstants will attempt to fold without taking into account whether the byte-sized type is legal (e.g. AVX512F only targets). If we have a full-zeroable vector then we should just return a zero version of the root type, otherwise if the type isn't valid we should bail. Fixes PR45443 (cherry picked from commit e3b60597769f79a8abc19fb8ef1f321d9adc1358)
* [SimplifyCFG] Skip merging return blocks if it would break a CallBr. (Jonas Paulsson, 2020-04-16, 2 files, -0/+43)
| | | | | | | | | | | | | | | | | | SimplifyCFG should not merge empty return blocks and leave a CallBr behind with a duplicated destination since the verifier will then trigger an assert. This patch checks for this case and avoids the transformation. CodeGenPrepare has a similar check which also has a FIXME comment about why this is needed. It seems perhaps better if these two passes would eventually instead update the CallBr instruction instead of just checking and avoiding. This fixes https://bugs.llvm.org/show_bug.cgi?id=45062. Review: Craig Topper Differential Revision: https://reviews.llvm.org/D75620 (cherry picked from commit c2dafe12dc24f7f1326f5c4c6a3b23f1485f1bd6)
* [ELF] Allow SHF_LINK_ORDER and non-SHF_LINK_ORDER to be mixed (Fangrui Song, 2020-04-14, 4 files, -31/+31)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, `error: incompatible section flags for .rodata` is reported when we mix SHF_LINK_ORDER and non-SHF_LINK_ORDER sections in an output section. This is overconstrained. This patch allows mixed flags with the requirement that SHF_LINK_ORDER sections must be contiguous. Mixing flags is used by Linux aarch64 (https://github.com/ClangBuiltLinux/linux/issues/953) .init.data : { ... KEEP(*(__patchable_function_entries)) ... } When the integrated assembler is enabled, clang's -fpatchable-function-entry=N[,M] implementation sets the SHF_LINK_ORDER flag (D72215) to fix a number of garbage collection issues. Strictly speaking, the ELF specification does not require contiguous SHF_LINK_ORDER sections but for many current uses of SHF_LINK_ORDER like .ARM.exidx/__patchable_function_entries there has been a requirement for the sections to be contiguous on top of the requirements of the ELF specification. This patch also imposes one restriction: SHF_LINK_ORDER sections cannot be separated by a symbol assignment or a BYTE command. Not allowing BYTE is a natural extension of the rule that a non-SHF_LINK_ORDER section cannot be a separator. Symbol assignments can delimit the contents of SHF_LINK_ORDER sections. Allowing SHF_LINK_ORDER sections across symbol assignments (especially __start_/__stop_) can make things hard to explain. The restriction should not be a problem for practical use cases. Reviewed By: psmith Differential Revision: https://reviews.llvm.org/D77007 (cherry picked from commit 673e81eee4fa3ffa38736f1063e6c4fa2d9278b0)
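The contiguity requirement stated in this message can be modeled as a single pass over the input sections of one output section. The sketch below is an illustration only, not lld's implementation: `linkOrderContiguous` is a hypothetical name, and each input section is reduced to a boolean saying whether it carries SHF_LINK_ORDER.

```cpp
#include <cassert>
#include <vector>

// Returns true if all flagged entries form one contiguous run, or if there
// are no flagged entries at all.
inline bool linkOrderContiguous(const std::vector<bool> &IsLinkOrder) {
  enum State { BeforeRun, InRun, AfterRun };
  State S = BeforeRun;
  for (bool Flagged : IsLinkOrder) {
    if (Flagged) {
      if (S == AfterRun)
        return false; // a second run started after an unflagged separator
      S = InRun;
    } else if (S == InRun) {
      S = AfterRun;
    }
  }
  return true;
}
```

A layout such as unflagged, flagged, flagged, unflagged passes the check, while flagged, unflagged, flagged does not, because the unflagged entry separates two SHF_LINK_ORDER runs.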
* [ELF][test] Improve linkerscript/linkorder.s (Fangrui Song, 2020-04-14, 1 file, -18/+48)
| | | | (cherry picked from commit 2d19270efcf01672c8eaab1ccb0e5b89ea953cc9)
* [ELF][test] Rename SHF_LINK_ORDER related "metadata" to "linkorder" (Fangrui Song, 2020-04-14, 11 files, -62/+49)
| | | | | | Test cleanups. (cherry picked from commit b305b8a256eade076bb13f52668a6015631ac0e5)
* Teach TreeTransform to substitute into resolved TemplateArguments. (Richard Smith, 2020-04-14, 2 files, -47/+72)
| | | | | | | This comes up when substituting into an already-substituted template argument during constraint satisfaction checking. (cherry picked from commit b20ab412bf838a8a87e5cc1c8c6399c3c9255354)
* [DAGCombine] Fix splitting indexed loads in ForwardStoreValueToDirectLoad() (Nemanja Ivanovic, 2020-04-14, 2 files, -10/+73)
| | | | | | | | | | | | | | | In DAGCombiner::visitLOAD() we perform some checks before breaking up an indexed load. However, we don't do the same checking in ForwardStoreValueToDirectLoad() which can lead to failures later during combining (see: https://bugs.llvm.org/show_bug.cgi?id=45301). This patch just adds the same checks to this function as well. Fixes: https://bugs.llvm.org/show_bug.cgi?id=45301 Differential revision: https://reviews.llvm.org/D76778 (cherry picked from commit 482141134729237072cb94248381dab96ce34374)
* [COFF] Don't treat DWARF sections as GC roots (Reid Kleckner, 2020-04-13, 2 files, -2/+64)
| | | | | | | | | | | | | | DWARF sections are typically live and not COMDAT, so they would be treated as GC roots. Enabling DWARF would essentially keep all code with debug info alive, preventing any section GC. Fixes PR45273 Reviewed By: mstorsjo, MaskRay Differential Revision: https://reviews.llvm.org/D76935 (cherry picked from commit c579a5b1d92a9bc2046d00ee2d427832e0f5ddec)
* [CodeGen] Fix sinking local values in lpads with phis (Reid Kleckner, 2020-04-13, 2 files, -1/+52)
| | | | | | | | | | There was already a test case for landingpads to handle this case, but I had forgotten to consider PHI instructions preceding the EH_LABEL in the landingpad. PR45261 (cherry picked from commit e5bf5037d869c74bc2faf81fa1f58dfd827e8356)
* Use FinishThunk to finish musttail thunks (Reid Kleckner, 2020-04-13, 3 files, -2/+59)
| | | | | | | | | | | | | | | | | FinishThunk, and the invariant of setting and then unsetting CurCodeDecl, was added in 7f416cc42638 (2015). The invariant didn't exist when I added this musttail codepath in ab2090d10765 (2014). Recently in 28328c3771, I started using this codepath on non-Windows platforms, and users reported problems during release testing (PR44987). The issue was already present for users of EH on i686-windows-msvc, so I added a test for that case as well. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D76444 (cherry picked from commit ce5173c0e174870934d1b3a026f631d996136191)
* Add yaml definitions for CI tests with GitHub Actions (Tom Stellard, 2020-04-13, 3 files, -0/+191)
| | | | Imported from release/9.x branch with some additional changes.
* Bump version to 10.0.1 (Tom Stellard, 2020-04-13, 7 files, -8/+8)
|
* [llvm-objcopy] Improve tool selection logic to recognize llvm-strip-$major as strip (Fangrui Song, 2020-04-11, 2 files, -5/+52)
| | | | | | | | | | | | | | | | | Debian and some other distributions install llvm-strip as llvm-strip-$major (e.g. `/usr/bin/llvm-strip-9`) D54193 made it work with llvm-strip-$major but did not add a test. The behavior was regressed by D69146. Fixes https://github.com/ClangBuiltLinux/linux/issues/940 Reviewed By: alexshap Differential Revision: https://reviews.llvm.org/D76562 (cherry picked from commit f2f96eb605bc770e4da400dbcc7a6d2526ec1fd4)
* [ELF] Fix a null pointer dereference when --emit-relocs and --strip-debug are used together (Fangrui Song, 2020-04-10, 4 files, -4/+36)
| | | | | | | | | | | | | | | | | | | | | | | | Fixes https://bugs.llvm.org//show_bug.cgi?id=44878 When --strip-debug is specified, .debug* are removed from inputSections while .rel[a].debug* (incorrectly) remain. LinkerScript::addOrphanSections() requires the output section of a relocated InputSectionBase to be created first. .debug* are not in inputSections -> output sections .debug* are not created -> getOutputSectionName(.rel[a].debug*) dereferences a null pointer. Fix the null pointer dereference by deleting .rel[a].debug* from inputSections as well. Reviewed By: grimar, nickdesaulniers Differential Revision: https://reviews.llvm.org/D74510 (cherry picked from commit 6c73246179376442705b3a545f4e1f1478777a04)
* [CUDA] Warn about unsupported CUDA SDK version only if it's used. (Artem Belevich, 2020-03-23, 3 files, -13/+30)
| | | | | | | | | This fixes an issue with clang issuing a warning about unknown CUDA SDK if it's detected during non-CUDA compilation. Differential Revision: https://reviews.llvm.org/D76030 (cherry picked from commit eb2ba2ea953b5ea73cdbb598f77470bde1c6a011)
* clang/release notes: s/Subversion/git/ (Sylvestre Ledru, 2020-03-22, 1 file, -1/+1)
|
* [Concepts] Fix incorrect control flow when TryAnnotateTypeConstraint annotates an invalid template-id (Saar Raz, 2020-03-19, 7 files, -10/+25)
| | | | | | | | | | | | | TryAnnotateTypeConstraint could annotate a template-id which doesn't end up being a type-constraint, in which case control flow would incorrectly flow into ParseImplicitInt. Reenter the loop in this case. Enable relevant tests for C++20. This required disabling typo-correction during TryAnnotateTypeConstraint and changing a test case which is broken due to a separate bug (will be reported and handled separately). (cherry picked from commit 19fccc52ff2c1da1f93d9317c34769bd9bab8ac8)
* [Concepts] Fix incorrect DeclContext for transformed RequiresExprBodyDecl (Saar Raz, 2020-03-19, 2 files, -1/+14)
| | | | | | | | | We would assign the incorrect DeclContext when transforming the RequiresExprBodyDecl, causing incorrect handling of 'this' inside RequiresExprBodyDecls (bug #45162). Assign the current context as the DeclContext of the transformed decl. (cherry picked from commit 9769e1ee9acc33638449b50ac394b5ee2d4efb60)
* ../llvm/utils/update_test_checks.py --opt-binary bin/opt ../llvm/test/Transforms/PhaseOrdering/min-max-abs-cse.ll (Hans Wennborg, 2020-03-19, 1 file, -6/+24)
|
* [EarlyCSE] avoid crashing when detecting min/max/abs patterns (PR41083) (Sanjay Patel, 2020-03-19, 3 files, -15/+134)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As discussed in PR41083: https://bugs.llvm.org/show_bug.cgi?id=41083 ...we can assert/crash in EarlyCSE using the current hashing scheme and instructions with flags. ValueTracking's matchSelectPattern() may rely on overflow (nsw, etc) or other flags when detecting patterns such as min/max/abs composed of compare+select. But the value numbering / hashing mechanism used by EarlyCSE intersects those flags to allow more CSE. Several alternatives to solve this are discussed in the bug report. This patch avoids the issue by doing simple matching of min/max/abs patterns that never requires instruction flags. We give up some CSE power because of that, but that is not expected to result in much actual performance difference because InstCombine will canonicalize these patterns when possible. It even has this comment for abs/nabs: /// Canonicalize all these variants to 1 pattern. /// This makes CSE more likely. (And this patch adds PhaseOrdering tests to verify that the expected transforms are still happening in the standard optimization pipelines.) I left this code to use ValueTracking's "flavor" enum values, so we don't have to change the callers' code. If we decide to go back to using the ValueTracking call (by changing the hashing algorithm instead), it should be obvious how to replace this chunk. Differential Revision: https://reviews.llvm.org/D74285 (cherry picked from commit b8ebc11f032032c7ca449f020a1fe40346e707c8)