summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
...
* [MC][ARM] Delete MCSection::HasData and move SHF_ARM_PURECODE logic to ↵Fangrui Song2020-01-054-17/+6
| | | | | | | | ARMELFObjectWriter::addTargetSectionFlags This simplifies the generic interface and also makes SHF_ARM_PURECODE more robust (fixes a TODO). Inspecting MCDataFragment contents covers more cases than MCObjectStreamer::EmitBytes.
* Add missing testStephen Kelly2020-01-051-0/+12
|
* [Gnu toolchain] Look at standard GCC paths for libstdcxx by defaultKristina Brooks2020-01-056-86/+121
| | | | | | | | | | | Linux' current addLibCxxIncludePaths and addLibStdCxxIncludePaths are actually almost non-Linux-specific at all, and can be reused almost as such for all gcc toolchains. Only keep Android/Freescale/Cray hacks in Linux's version. Patch by sthibaul (Samuel Thibault) Differential Revision: https://reviews.llvm.org/D69758
* [MC] Delete MCSection::{rbegin,rend}Fangrui Song2020-01-052-8/+2
|
* Allow using traverse() with bindingsStephen Kelly2020-01-053-0/+38
|
* Fix oversight in AST traversal helperStephen Kelly2020-01-051-1/+1
|
* [MC] Merge MCSymbol::getSectionPtr into getSection and simplifyFangrui Song2020-01-051-9/+1
|
* [MC] Drop an unused rule about absolute temporary symbolsFangrui Song2020-01-051-4/+0
|
* [X86][SSE] Combine combineLogicBlendIntoConditionalNegate for VSELECT nodes ↵Simon Pilgrim2020-01-053-139/+117
| | | | | | | | (PR43660) Attempt to use combineLogicBlendIntoConditionalNegate for (select M, (sub 0, X), X) -> (sub (xor X, M), M) We limit this to cases that can't easily replace the VSELECT with a shuffle (non-constant masks) or where a BLENDV is likely to occur (which tends to result in slower codegen).
* [X86] Move combineLogicBlendIntoConditionalNegate before combineSelect. NFCI.Simon Pilgrim2020-01-051-62/+62
| | | | Updates function order in preparation of future fix for PR43660
* [X86] Merge (identical) LowerGC_TRANSITION_START and LowerGC_TRANSITION_END ↵Simon Pilgrim2020-01-052-27/+4
| | | | | | (NFC) Silences a copy+paste analyzer warning - all they are doing are inserting NOOPs in exactly the same way.
* [ARM] Use isFMAFasterThanFMulAndFAdd for scalars as well as MVE vectorsDavid Green2020-01-0510-25/+52
| | | | | | | | | | | This adds extra scalar handling to isFMAFasterThanFMulAndFAdd, allowing the target independent code to handle more folds in more situations (for example if the fast math flags are present, but the global AllowFPOpFusion option isnt). It also splits apart the HasSlowFPVMLx into HasSlowFPVFMx, to allow VFMA and VMLA to be controlled separately if needed. Differential Revision: https://reviews.llvm.org/D72139
* [ARM] Fill in FP16 FMA patternsDavid Green2020-01-052-64/+65
| | | | | | This adds fp16 variants of all the fma patterns in the ARM backend. Differential Revision: https://reviews.llvm.org/D72138
* [ARM] Add and update FMA tests. NFCDavid Green2020-01-053-31/+480
|
* [ParserTest] Move raw string literal out of macroDavid Green2020-01-051-3/+3
| | | | | Some combinations of gcc and ccache do not deal well with raw strings in macros. Moving the string out to attempt to fix the bots.
* [LegalizeVectorOps][X86] Enable expansion of vector fp_to_uint in ↵Craig Topper2020-01-044-258/+113
| | | | | | | | | | | | | LegalizeVectorOps to avoid scalarization. The code here isn't great in all caess. Particularly v4f64->v4i32 on 64-bit AVX targets. But there is some improvement in some configurations. There's definitely some issues with computeNumSignBits with X86ISD::STRICT_FCMP. As well as not being able to propagate sign bits through merge_values nodes that get created during custom legalization.
* [TargetLowering] In expandFP_TO_UINT, add proper extend or truncate for the ↵Craig Topper2020-01-041-0/+4
| | | | | | | | | | | | condition to feed the DstVT select. Previously, for vectors we created a vselect with a condition that didn't match what the target wanted according to getSetCCResultType. To make up for this, X86 had a special DAG combine to detect if the condition was all sign bits and then insert its own truncate or extend. By adding the extend/truncate here explicitly we can avoid that.
* [LegalizeVectorOps] Split most of ExpandStrictFPOp into a separate ↵Craig Topper2020-01-041-6/+13
| | | | | | | | | | | UnrollStrictFPOp method. Call that method from ExpandUINT_TO_FLOAT. ExpandStrictFPOp calls ExpandUINT_TO_FLOAT. Previously, ExpandUINT_TO_FLOAT returned SDValue() if it wasn't able to handle and needed to unroll. Then ExpandStrictFPOp would detect his SDValue() and do the unroll. After this change, ExpandUINT_TO_FLOAT will directly call UnrollStrictFPOp and return the unrolled result.
* [ELF] Drop const qualifier to fix -Wrange-loop-analysis. NFCFangrui Song2020-01-041-1/+1
| | | | | | | | | | | | | | | | ``` lld/ELF/Relocations.cpp:1622:56: warning: loop variable 'ts' of type 'const std::pair<ThunkSection *, uint32_t>' (aka 'const pair<lld::elf::ThunkSection *, unsigned int>') creates a copy from type 'const std::pair<ThunkSection *, uint32_t>' [-Wrange-loop-analysis] for (const std::pair<ThunkSection *, uint32_t> ts : isd->thunkSections) ``` Drop const qualifier to fix -Wrange-loop-analysis. We can make -Wrange-loop-analysis warnings (DiagnoseForRangeConstVariableCopies) on `const A` more permissive on more types (e.g. POD -> trivially copyable), unfortunately it will not make std::pair good, because `constexpr pair& operator=(const pair& p);` is unfortunately user-defined. Reviewed By: Mordante Differential Revision: https://reviews.llvm.org/D72211
* GlobalISel: Scalarize all division operationsMatt Arsenault2020-01-046-0/+1742
| | | | | | This only handled G_SDIV, but they all are trivially scalarizable. Also define placeholder AMDGPU division legalizer rules.
* Revert "[SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC)."Florian Hahn2020-01-0425-946/+858
| | | | | This reverts commit 51ef53f3bd23559203fe9af82ff2facbfedc1db3, as it breaks some bots.
* [SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC).Florian Hahn2020-01-0425-858/+946
| | | | | | | | | | | | SCEVExpander modifies the underlying function so it is more suitable in Transforms/Utils, rather than Analysis. This allows using other transform utils in SCEVExpander. Reviewers: sanjoy.google, efriedma, reames Reviewed By: sanjoy.google Differential Revision: https://reviews.llvm.org/D71537
* [SCEV] Remove unused ScalarEvolutionExpander.h includes (NFC).Florian Hahn2020-01-044-4/+0
|
* GlobalISel: Define G_READCYCLECOUNTERMatt Arsenault2020-01-046-0/+27
|
* AMDGPU/GlobalISel: Refine SMRD selection rulesMatt Arsenault2020-01-043-36/+172
| | | | | Fix selecting these for volatile global loads, and ensure the loads are constant enough.
* AMDGPU/GlobalISel: Legalize more odd sized loadsMatt Arsenault2020-01-045-307/+43
| | | | | The attempts to widen sufficently aligned, odd sized loads wasn't consistently applied.
* AMDGPU/GlobalISel: Assume vcc phis for any vcc inputMatt Arsenault2020-01-043-86/+69
| | | | | | | This produces more intelligible looking results, more comparabble to the DAG output in the simplest cases. This is probably wrong in complex control flow, but RegBankSelect doesn't attempt analyzing if this is on a masked path for selecting the bank yet.
* [Pass Registration] XFAIL load_extension.ll test on macOS.Florian Hahn2020-01-041-0/+3
| | | | | | | | | | | This test fails on macOS, causing the following bots to fail http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/7438/ http://green.lab.llvm.org/green/job/clang-stage1-RA/5034/ Error: Error opening 'build/./lib/libBye.dylib': dlopen(build/./lib/libBye.dylib, 9): image not found -load request ignored.
* AMDGPU/GlobalISel: Implement applyMappingImpl less incorrectlyMatt Arsenault2020-01-041-13/+23
| | | | | | | | | | | We're checking the current register bank of the registers in the instruction, but the mapping may have inserted cross bank copies and is expecting to replace the registers. We mostly get away with this currently, because VGPR->SGPR copies are illegal, and we assume this won't happen. In a future change, we'll start relying on more cross register bank copies being inserted, and this starts to break down.
* [cmake] Remove install from add_llvm_example_library.Florian Hahn2020-01-041-4/+3
| | | | | This should fix http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/30086
* Re-apply "[Examples] Add IRTransformations directory to examples."Florian Hahn2020-01-0417-0/+1004
| | | | | | This reverts commit 19fd8925a4afe6efd248688cce06aceff50efe0c. Should include a fix for PR44197.
* NFC: Fix trivial typos in commentsKazuaki Ishizaki2020-01-0468-80/+80
|
* [AMDGPU] need to insert wait between the scalar load and vector store to the ↵alex-t2020-01-042-0/+50
| | | | | | | | | | same address to avoid WAR conflict. Reviewers: rampitec, vpykhtin, nhaehnle Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D71934
* [NFCI][InstCombine] Refactor 'sink negation into select if that folds one ↵Roman Lebedev2020-01-041-40/+35
| | | | | | | | | | hand of select to 0' fold I would think it's better than having two practically identical folds next to eachother, but then generalization isn't all that pretty due to the fact that we need to produce different `sub` each time.. This change is no-functional-changes-intended refactoring.
* [InstCombine] Sink sub into hands of select if one hand becomes zero. Part 2 ↵Roman Lebedev2020-01-043-22/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | (PR44426) This decreases use count of %Op0, makes one hand of select to be 0, and possibly exposes further folding potential. Name: sub %Op0, (select %Cond, %Op0, %FalseVal) -> select %Cond, 0, (sub %Op0, %FalseVal) %Op0 = %TrueVal %o = select i1 %Cond, i8 %Op0, i8 %FalseVal %r = sub i8 %Op0, %o => %n = sub i8 %Op0, %FalseVal %r = select i1 %Cond, i8 0, i8 %n Name: sub %Op0, (select %Cond, %TrueVal, %Op0) -> select %Cond, (sub %Op0, %TrueVal), 0 %Op0 = %FalseVal %o = select i1 %Cond, i8 %TrueVal, i8 %Op0 %r = sub i8 %Op0, %o => %n = sub i8 %Op0, %TrueVal %r = select i1 %Cond, i8 %n, i8 0 https://rise4fun.com/Alive/aHRt https://bugs.llvm.org/show_bug.cgi?id=44426
* [NFC][InstCombine] 'subtract from one hands of select' pattern tests (PR44426)Roman Lebedev2020-01-042-11/+78
| | | | https://bugs.llvm.org/show_bug.cgi?id=44426
* [InstCombine] Sink sub into hands of select if one hand becomes zero (PR44426)Roman Lebedev2020-01-043-22/+43
| | | | | | | | | | | | | | | | | | | | | | | | | This decreases use count of %Op1, makes one hand of select to be 0, and possibly exposes further folding potential. Name: sub (select %Cond, %Op1, %FalseVal), %Op1 -> select %Cond, 0, (sub %FalseVal, %Op1) %Op1 = %TrueVal %o = select i1 %Cond, i8 %Op1, i8 %FalseVal %r = sub i8 %o, %Op1 => %n = sub i8 %FalseVal, %Op1 %r = select i1 %Cond, i8 0, i8 %n Name: sub (select %Cond, %TrueVal, %Op1), %Op1 -> select %Cond, (sub %TrueVal, %Op1), 0 %Op1 = %FalseVal %o = select i1 %Cond, i8 %TrueVal, i8 %Op1 %r = sub i8 %o, %Op1 => %n = sub i8 %TrueVal, %Op1 %r = select i1 %Cond, i8 %n, i8 0 https://rise4fun.com/Alive/avL https://bugs.llvm.org/show_bug.cgi?id=44426
* [NFC][InstCombine] 'subtract of one hands of select' pattern tests (PR44426)Roman Lebedev2020-01-041-0/+89
| | | | https://bugs.llvm.org/show_bug.cgi?id=44426
* [Transforms][GlobalSRA] huge array causes long compilation time and huge ↵Alexey Lapshin2020-01-042-65/+138
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | memory usage. Summary: For artificial cases (huge array, few usages), Global SRA optimization creates a lot of redundant data. It creates an instance of GlobalVariable for each array element. For huge array, that means huge compilation time and huge memory usage. Following example compiles for 10 minutes and requires 40GB of memory. namespace { char LargeBuffer[64 * 1024 * 1024]; } int main ( void ) { LargeBuffer[0] = 0; printf("\n "); return LargeBuffer[0] == 0; } The fix is to avoid Global SRA for large arrays. Reviewers: craig.topper, rnk, efriedma, fhahn Reviewed By: rnk Subscribers: xbolva00, lebedev.ri, lkail, merge_guards_bot, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71993
* [TargetLowering] SimplifyDemandedBits - call SimplifyMultipleUseDemandedBits ↵Simon Pilgrim2020-01-0412-150/+151
| | | | | | | | | | | | | | for ISD::EXTRACT_VECTOR_ELT (REAPPLIED) This patch attempts to peek through vectors based on the demanded bits/elt of a particular ISD::EXTRACT_VECTOR_ELT node, allowing us to avoid dependencies on ops that have no impact on the extract. In particular this helps remove some unnecessary scalar->vector->scalar patterns. The wasm shift patterns are annoying - @tlively has indicated that the wasm vector shift codegen are to be refactored in the near-term and isn't considered a major issue. Reapplied after reversion at rL368660 due to PR42982 which was fixed at rGca7fdd41bda0. Differential Revision: https://reviews.llvm.org/D65887
* [LLD] [COFF] Don't error out on duplicate absolute symbols with the same valueMartin Storsjö2020-01-044-3/+31
| | | | | | | | | Both MS link.exe and GNU ld.bfd handle it this way; one can have multiple object files defining the same absolute symbols, as long as it defines it to the same value. But if there are multiple absolute symbols with differing values, it is treated as an error. Differential Revision: https://reviews.llvm.org/D71981
* [X86] Update MaxIndex test in x86-cmov-converter.ll to return the index and ↵Craig Topper2020-01-031-16/+10
| | | | | | | | | | | | not use the index to look up the array after the loop. This represents a more realistic version of the code being tested. The cmov converter doesn't look at the code after the loop so it doesn't matter for what's being tested. But as noted in this twitter thread https://twitter.com/trav_downs/status/1213311159413161987 gcc can turn the previous MaxIndex code into the MaxValue code. So returning the index makes it a distinct case.
* [OpenMP] NFC: Fix trivial typos in commentsKelvin Li2020-01-0325-36/+36
| | | | | | Submitted by: kiszk Differential Revision: https://reviews.llvm.org/D72171
* [gn build] Port 5d304d68dd5LLVM GN Syncbot2020-01-041-1/+0
|
* Revert "[gicombiner] Add GIMatchTree and use it for the code generation"Daniel Sanders2020-01-0313-1928/+17
| | | | | | | | | All the windows bots are failing match-tree.td and there's no obvious cause that I can see. It's not just the %p formatting problem. My best guess is that there's an ordering issue too but I'll need further information to figure that out. Revert while I'm investigating. This reverts commit 64f1bb5cd2c6d69af7c74ec68840029603560238 and 77d4b5f5feff663e70b347516cc4c77fa5cd2a20
* [lldb/Command] Add --force option for `watchpoint delete` commandMed Ismail Bennani2020-01-043-39/+94
| | | | | | | | | | | | | | Currently, there is no option to delete all the watchpoint without LLDB asking for a confirmation. Besides making the watchpoint delete command homogeneous with the breakpoint delete command, this option could also become handy to trigger automated watchpoint deletion i.e. using breakpoint actions. rdar://42560586 Differential Revision: https://reviews.llvm.org/D72096 Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
* [X86] Autogenerate complete checks. NFCCraig Topper2020-01-033-1061/+5484
|
* [Remarks] Warn if a remark file is not found when processing static archivesFrancis Visoiu Mistrih2020-01-036-3/+82
| | | | | | | | | | | Static archives contain object files which contain sections pointing to external remark files. When static archives are shipped without the remark files, dsymutil shouldn't generate an error. Instead, generate a warning to inform the user that remarks for that library won't be available in the .dSYM.
* [UserExpression] Clean up `return` after `else`.Davide Italiano2020-01-031-4/+3
|
* [gicombiner] Correct 64f1bb5cd2c to account for MSVC's %p formatDaniel Sanders2020-01-031-20/+20
|
OpenPOWER on IntegriCloud