Commit message | Author | Age | Files | Lines
...
* [X86] Add some reduction add test cases that show sub-optimal code on avx2 ↵Craig Topper2019-08-121-0/+225
| | | | | | | | | | | | | and later. For v4i8 and v8i8 when the reduction starts with a load we end up shifting the data in the scalar domain and copying to the vector domain a second time using a broadcast. We already copied it to the vector domain once. It's better to just shuffle it there. llvm-svn: 368544
* [X86] Support -march=tigerlakePengfei Wang2019-08-129-2/+173
| | | | | | | | | | | | Support -march=tigerlake for x86. Compared with Icelake Client, it includes 4 new features: avx512vp2intersect, movdiri, movdir64b, shstk. Patch by Xiang Zhang (xiangzhangllvm) Differential Revision: https://reviews.llvm.org/D65840 llvm-svn: 368543
* Fix pass dependency for LICMWenlei He2019-08-111-6/+7
| | | | | | Expected to address buildbot failure http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/16285 caused by D65060. llvm-svn: 368542
* [X86] Remove redundant ';' chars ending IR lines in lit tests. NFCBjorn Pettersson2019-08-117-96/+96
| | | | | | | | | | | | | | Reviewers: RKSimon, craig.topper Reviewed By: craig.topper Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66053 llvm-svn: 368541
* [SelectionDAG] Widen vector results of SMULFIX/UMULFIX/SMULFIXSATBjorn Pettersson2019-08-114-0/+142
| | | | | | | | | | | | | | | | | | | | | Summary: After the commits that changed x86 backend to widen vectors instead of using promotion some of our downstream tests started to fail. It was noticed that WidenVectorResult has been missing support for SMULFIX/UMULFIX/SMULFIXSAT. This patch adds the missing functionality. Reviewers: craig.topper, RKSimon Reviewed By: craig.topper Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66051 llvm-svn: 368540
* [clang-format] Expand AllowShortBlocksOnASingleLine for WebKitOwen Pan2019-08-116-24/+116
| | | | | | | | See PR40840 Differential Revision: https://reviews.llvm.org/D66059 llvm-svn: 368539
* [X86] Simplify some of the type checks in combineSubToSubus.Craig Topper2019-08-111-5/+10
| | | | | | | If we have SSE2 we can handle any i8/i16 type and let type legalization deal with it. llvm-svn: 368538
* [X86] Don't use SplitOpsAndApply for ISD::USUBSAT.Craig Topper2019-08-111-10/+4
| | | | | | | Target independent type legalization and custom lowering should be able to handle it. llvm-svn: 368537
* [ELF] Remove redundant isDefined() in Symbol::computeBinding() and delete ↵Fangrui Song2019-08-112-5/+3
| | | | | | | | | | | | | | one redundant call site After r367869, VER_NDX_LOCAL can only be assigned to Defined and CommonSymbol. CommonSymbol becomes Defined after replaceCommonSymbols(), thus `versionId == VER_NDX_LOCAL` will imply `isDefined()`. In maybeReportUndefined(), computeBinding() is called when the symbol is unknown to be Undefined. computeBinding() != STB_LOCAL will always be true. llvm-svn: 368536
* [ELF] Remove redundant !isPreemptible in Symbol::computeBinding()Fangrui Song2019-08-111-1/+1
| | | | | | | | | | | | | | !isPreemptible was added in r343668 to fix PR39104: symbols redefined by replaceWithDefined() might be incorrectly considered STB_LOCAL if a version script specified `local: *;`. After r367869 (`config->defaultSymbolVersion` was removed), we will assign VER_NDX_LOCAL to only regular Defined and CommonSymbol, not Defined created by replaceWithDefined() (because scanVersionScript() is called before scanRelocations()). The !isPreemptible is thus redundant and can be deleted. llvm-svn: 368535
* Properly detect temporary gsl::Owners through reference initialization chains.Gabor Horvath2019-08-112-3/+10
| | | | llvm-svn: 368534
* [ELF] Remove unnecessary assignment to `used` in replaceWithDefinedFangrui Song2019-08-111-1/+0
| | | | | | | `Symbol::used` is used by Undefined and SharedSymbol to record if a .symtab entry is needed. It is of no use for Defined. llvm-svn: 368533
* [NFC][CodeGen] Use while loop instead of for loop in ↵Kang Zhang2019-08-111-3/+4
| | | | | | | | MachineBlockPlacement::optimizeBranches() This will pass EXPENSIVE check. llvm-svn: 368532
* [ARM] MVE spill vector test. NFCDavid Green2019-08-111-0/+163
| | | | llvm-svn: 368531
* [MVE] Don't try to unroll vectorised MVE loopsDavid Green2019-08-112-0/+132
| | | | | | | | | | | | | | | Due to the nature of the beat system in the MVE architecture, along with tail predication and low-overhead loops, unrolling has less benefit compared to normal loops. You can not, for example, hide the latency of a load with other instructions as you can for scalar code. Preventing unrolling also makes the code easier to read and reason about. So if a loop contains vector code, don't enable the runtime unrolling. At least for the time being. Differential Revision: https://reviews.llvm.org/D65803 llvm-svn: 368530
* [ARM] Permit auto-vectorization using MVEDavid Green2019-08-112-2/+11
| | | | | | | | | | | | | With enough codegen complete, we can now correctly report the number and size of vector registers for MVE, allowing auto vectorisation. This also allows FP auto-vectorization for MVE without -Ofast/-ffast-math, due to support for IEEE FP arithmetic and parity between scalar and vector FP behaviour. Patch by David Sherwood. Differential Revision: https://reviews.llvm.org/D63728 llvm-svn: 368529
* Properly handle reference initialization when detecting gsl::Pointer ↵Gabor Horvath2019-08-112-8/+20
| | | | | | initialization chains llvm-svn: 368528
* Fix __clang_call_terminate's argument for foreign exceptionsHeejin Ahn2019-08-112-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When exceptions are repeatedly thrown in the middle of handling another exception, we call `__clang_call_terminate` with the exception pointer (i32) as an argument. But in case of foreign exceptions, we don't have the pointer, so we call the function with 0. (This requires that `__clang_call_terminate` can deal with a 0 argument, which will be done later.) But previously the 0 argument was not added as an `i32.const 0` but as an immediate by mistake, causing the `call` instruction to take not an i32 but rather an exnref, because an `exnref` is left on top of the value stack if `br_on_exn` is not taken.
```
block i32
  br_on_exn 0, __cpp_exception  ;; exnref is on top of stack now
  i32.const 0                   ;; This was missing!
  call __clang_call_terminate
  unreachable
end
call __clang_call_terminate     ;; This takes i32 extracted by br_on_exn
```
Reviewers: dschuff Subscribers: sbc100, jgravelle-google, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65475 llvm-svn: 368527
* [LICM] Make Loop ICM profile awareWenlei He2019-08-117-27/+120
| | | | | | | | | | | | | | | | | | | | | Summary: Hoisting/sinking an instruction out of a loop isn't always beneficial. Hoisting an instruction from a cold block inside a loop body out of the loop could hurt performance. This change makes Loop ICM profile aware - it now checks block frequency to make sure hoisting/sinking only moves instructions to a colder block. Test Plan: ninja check Reviewers: asbirlea, sanjoy, reames, nikic, hfinkel, vsk Reviewed By: asbirlea Subscribers: fhahn, vsk, davidxl, xbolva00, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65060 llvm-svn: 368526
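The profile check described in the LICM change above can be pictured with a small sketch. It is only an illustration under stated assumptions: the function name `moves_to_colder_block`, the plain-integer frequencies, and the choice to accept a destination that is exactly as hot as the source are all hypothetical and not taken from the patch.

```python
def moves_to_colder_block(src_freq, dest_freq):
    """Return True if moving an instruction would not land it in a hotter block.

    Assumption: frequencies are plain non-negative numbers, and a destination
    that is exactly as hot as the source is treated as acceptable.
    """
    return dest_freq <= src_freq


# Hypothetical frequencies: the preheader runs 100 times, the loop header
# 1000 times, and a rarely taken block inside the loop only 10 times.
PREHEADER, HEADER, COLD_BLOCK = 100, 1000, 10

# Hoisting out of the hot header into the preheader moves to a colder block.
hoist_from_header = moves_to_colder_block(HEADER, PREHEADER)

# Hoisting out of the cold block into the preheader would move to a hotter one.
hoist_from_cold_block = moves_to_colder_block(COLD_BLOCK, PREHEADER)
```

With these made-up numbers, hoisting from the header is accepted and hoisting from the cold block is rejected, which is the case the commit message says could hurt performance.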
* Revert "test commit"Wenlei He2019-08-111-2/+0
| | | | | | This reverts commit ad92a4a2769425ad0d39ac1dbb6282f6f51a1af7. llvm-svn: 368525
* test commitWenlei He2019-08-111-0/+2
| | | | llvm-svn: 368524
* [X86] Remove some more code from combineShuffle that is no longer needed ↵Craig Topper2019-08-111-47/+0
| | | | | | with widening legalization. llvm-svn: 368523
* [X86] Remove some code from combineShuffle that seems largely unnecessary ↵Craig Topper2019-08-112-62/+2
| | | | | | | | | | | with widening legalization. The test case that changed is probably better served through allowing combineTruncatedArithmetic to create narrow vectors. It also appears InstCombine would have simplified this test case to remove the zext and trunc anyway. llvm-svn: 368522
* [InstCombine][NFC] Use SimplifyAddInst() instead of ↵Roman Lebedev2019-08-101-2/+2
| | | | | | SimplifyBinOp(Instruction::BinaryOps::Add, ) llvm-svn: 368521
* [NFC][InstCombine] Tests for shift amount reassociation in bittest with ↵Roman Lebedev2019-08-101-0/+475
| | | | | | | | | | | | | truncated shl (PR42399) trunc-of-shl: https://rise4fun.com/Alive/zGx https://rise4fun.com/Alive/sl0L I.e. no extra legality check needed. https://bugs.llvm.org/show_bug.cgi?id=42399 llvm-svn: 368520
* [InstCombine] Shift amount reassociation in bittest: relax one-use check ↵Roman Lebedev2019-08-102-7/+15
| | | | | | | | | | when shifting constant If one of the values being shifted is a constant, since the new shift amount is known-constant, the new shift will end up being constant-folded so, we don't need that one-use restriction then. llvm-svn: 368519
* [InstCombine] Shift amount reassociation in bittest: drop pointless one-use ↵Roman Lebedev2019-08-102-8/+8
| | | | | | | | | | restriction That one-use restriction is not needed for correctness - we have already ensured that one of the shifts will go away, so we know we won't increase the instruction count. So there is no need for that restriction. llvm-svn: 368518
* [NFC][InstCombine] Tests for shift amount reassociation in bittest with ↵Roman Lebedev2019-08-101-10/+67
| | | | | | shift of const llvm-svn: 368517
* Add support for FreeBSD's LD_32_LIBRARY_PATHDimitry Andric2019-08-103-1/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Because the dynamic linker for 32-bit executables on 64-bit FreeBSD uses the environment variable `LD_32_LIBRARY_PATH` instead of `LD_LIBRARY_PATH` to find needed dynamic libraries, running the 32-bit parts of the dynamic ASan tests will fail with errors similar to: ``` ld-elf32.so.1: Shared object "libclang_rt.asan-i386.so" not found, required by "Asan-i386-inline-Dynamic-Test" ``` This adds support for setting up `LD_32_LIBRARY_PATH` for the unit and regression tests. It will likely also require a minor change to the `TestingConfig` class in `llvm/utils/lit/lit`. Reviewers: emaste, kcc, rnk, arichardson Reviewed By: arichardson Subscribers: kubamracek, krytarowski, fedor.sergeev, delcypher, #sanitizers, llvm-commits Tags: #llvm, #sanitizers Differential Revision: https://reviews.llvm.org/D65772 llvm-svn: 368516
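A minimal sketch of the environment setup described in the FreeBSD change above, written as standalone Python in the spirit of a lit configuration. The helper names `library_path_variable` and `prepend_library_dir` and their parameters are hypothetical; the real change may set the variables differently.

```python
import os


def library_path_variable(host_os, is_32bit_target_on_64bit_host):
    """Pick the variable the dynamic linker is assumed to consult."""
    if host_os == "FreeBSD" and is_32bit_target_on_64bit_host:
        return "LD_32_LIBRARY_PATH"
    return "LD_LIBRARY_PATH"


def prepend_library_dir(env, variable, lib_dir):
    """Return a copy of env with lib_dir placed first in the given variable."""
    updated = dict(env)
    existing = updated.get(variable, "")
    # Skip an empty existing value so no stray separator is produced.
    updated[variable] = os.pathsep.join(p for p in (lib_dir, existing) if p)
    return updated


variable = library_path_variable("FreeBSD", True)
test_env = prepend_library_dir({}, variable, "/path/to/runtime/libs")
```

The directory `/path/to/runtime/libs` is a placeholder for wherever the sanitizer runtime libraries were built.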
* [X86][SSE] Lower shuffle as ANY_EXTEND_VECTOR_INREGSimon Pilgrim2019-08-104-173/+158
| | | | | | | | | | On SSE41+ targets we always lower vector shuffles to ZERO_EXTEND_VECTOR_INREG, even if we don't need the extended bits. This patch relaxes this so that we lower to ANY_EXTEND_VECTOR_INREG if we can, meaning that shuffle combines have a better idea of what elements need to be kept zero. This helps the multiple reduction code as we can now combine away a lot more of the pack+extend codes. Differential Revision: https://reviews.llvm.org/D65741 llvm-svn: 368515
* [NFC][CodeGen] Modify the PI++ to ++PI in ↵Kang Zhang2019-08-101-1/+1
| | | | | | MachineBlockPlacement::optimizeBranches() llvm-svn: 368514
* [TableGen] Correct the shift to the proper bit width.Michael Liao2019-08-102-1/+12
| | | | | | - Replace the previous 32-bit shift with 64-bit one matching `OpInit`. llvm-svn: 368513
* [Reassociate] try harder to convert negative FP constants to positiveSanjay Patel2019-08-105-159/+204
| | | | | | | | | | | | | | | | | | | | | | This is an extension of a transform that tries to produce positive floating-point constants to improve canonicalization (and hopefully lead to more reassociation and CSE). The original patches were: D4904 D5363 (rL221721) But as the test diffs show, these were limited to basic patterns by walking from an instruction to its single user rather than recursively moving up the def-use sequence. No fast-math is required here because we're only rearranging implicit FP negations in intermediate ops. A motivating bug is: https://bugs.llvm.org/show_bug.cgi?id=32939 Differential Revision: https://reviews.llvm.org/D65954 llvm-svn: 368512
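The claim in the Reassociate change above, that no fast-math is required, can be checked numerically on one simple pattern. The sketch below is an illustration, not the pass itself: it assumes the pattern `x + (-c) * y` rewritten to `x - c * y`, and the function names are hypothetical. Negating a floating-point value is exact, so both forms should give bit-identical results.

```python
import random
import struct


def bits(value):
    """Raw IEEE-754 double encoding, so that +0.0 and -0.0 compare unequal."""
    return struct.pack("<d", value)


def with_negative_constant(x, y, c):
    # Form before canonicalization: the constant carries the minus sign.
    return x + (-c) * y


def with_positive_constant(x, y, c):
    # Form after canonicalization: positive constant, negation folded into fsub.
    return x - c * y


def forms_agree(trials=10000, seed=0):
    """Compare both forms bit for bit on random finite inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.uniform(-1e6, 1e6)
        y = rng.uniform(-1e6, 1e6)
        c = rng.uniform(0.0, 1e6)
        if bits(with_negative_constant(x, y, c)) != bits(with_positive_constant(x, y, c)):
            return False
    return True
```

The comparison uses raw bit patterns instead of `==` so that a difference in the sign of zero would also be caught.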
* [lldb] Fix dynamic_cast by no longer failing on variable without metadataRaphael Isemann2019-08-106-7/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Our IR rewriting infrastructure currently fails when it encounters a variable which has no metadata associated. This causes dynamic_cast to fail as in this case IRForTarget considers the type info pointers ('@_ZTI...') to be variables without associated metadata. As there are no variables for these internal variables, this is actually not an error and dynamic_cast would work fine if we didn't throw this error. This patch fixes this by removing this diagnostics code. In case we would actually hit a variable that has no metadata (but is supposed to have), we still have the error in the expression log so this shouldn't make it harder to diagnose any missing metadata errors. This patch should fix dynamic_cast and also adds a bunch of test coverage to that language feature. Fixes rdar://10813639 Reviewers: davide, labath Reviewed By: labath Subscribers: friss, labath, abidh, lldb-commits Tags: #lldb Differential Revision: https://reviews.llvm.org/D65932 llvm-svn: 368511
* [clang] Fixed x86 cpuid NSC signatureRaphael Isemann2019-08-101-2/+2
| | | | | | | | | | | | | | | | | | Summary: The signature "Geode by NSC" for NSC vendor is wrong. In lib/Headers/cpuid.h, signature_NSC_edx and signature_NSC_ecx constants are inverted (cpuid signature order is ebx # edx # ecx). Reviewers: teemperor, rsmith, craig.topper Reviewed By: teemperor, craig.topper Subscribers: craig.topper, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D65978 llvm-svn: 368510
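The register order mentioned in the cpuid change above (ebx, then edx, then ecx) can be illustrated by splitting the vendor string by hand. This is a sketch with hypothetical helper names; it does not read the real `cpuid.h` and only assumes that each register holds four ASCII characters in little-endian order.

```python
import struct


def vendor_registers(vendor):
    """Split a 12-character CPUID vendor string into (ebx, edx, ecx)."""
    raw = vendor.encode("ascii")
    if len(raw) != 12:
        raise ValueError("a CPUID vendor string is exactly 12 bytes")
    # Three little-endian 32-bit words, in the order ebx, edx, ecx.
    ebx, edx, ecx = struct.unpack("<3I", raw)
    return ebx, edx, ecx


def vendor_string(ebx, edx, ecx):
    """Rebuild the vendor string from the three registers."""
    return struct.pack("<3I", ebx, edx, ecx).decode("ascii")


ebx, edx, ecx = vendor_registers("Geode by NSC")

# Swapping edx and ecx, as the inverted constants did, scrambles the string.
swapped = vendor_string(ebx, ecx, edx)
```

Here `swapped` comes out as `"Geod NSCe by"`, which shows why inverted edx and ecx constants cannot match the real signature.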
* [CodeGen] Do the Simple Early Return in block-placement pass to optimize the ↵Kang Zhang2019-08-103-16/+42
| | | | | | | | | | | | | | blocks Summary: The `block-placement` pass creates some unconditional-branch patterns on which a simple early return can be performed. But the `early-ret` pass runs before `block-placement`, and we don't want to run it again. This patch performs the simple early return at the end of `block-placement` to optimize the blocks. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D63972 llvm-svn: 368509
* [modulemap] Add AArch64SVEACLETypes.def Kristina Brooks2019-08-101-0/+1
| | | | | | Update modulemap with a new textual header. llvm-svn: 368508
* [clang-format] Add SpaceInEmptyBlock option for WebKitOwen Pan2019-08-105-33/+60
| | | | | | | | See PR40840 Differential Revision: https://reviews.llvm.org/D65925 llvm-svn: 368507
* [X86] Match the IR pattern for movmsk on SSE1 only targets where v4i32 ↵Craig Topper2019-08-102-3/+53
| | | | | | | | | | | | | | | | | | | isn't legal Summary: This patch adds a special DAG combine for SSE1 to recognize the IR pattern InstCombine gives us for movmsk. This only does the recognition for a few cases where it's obvious the input won't be scalarized, which would result in building a vector just to do the movmsk. I've made it separate from our existing matching for movmsk since that's called in multiple places and I didn't spend time to see if the other callers would make sense here. Plus the restrictions and additional checks would complicate that. This fixes the case from PR42870. But it's probably still broken in the presence of logic ops feeding the movmsk pattern, which would further hide the v4f32 type. Reviewers: spatel, RKSimon, xbolva00 Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65689 llvm-svn: 368506
* [X86] Improve the diagnostic for larger than 4-bit immediate for ↵Craig Topper2019-08-104-5/+24
| | | | | | vpermil2pd/ps. Only allow MCConstantExprs. llvm-svn: 368505
* [Sanitizer] Reenable getusershell interceptionDavid Carlier2019-08-101-1/+1
| | | | | | | | | | and disabling it for Android. Reviewers: krytarowski, vitalybuka Reviewed By: krytarowski Differential Revision: https://reviews.llvm.org/D66027 llvm-svn: 368504
* [X86] Fix stack probe issue on windows32.Luo, Yuanke2019-08-105-8/+85
| | | | | | | | | | | | | | | | | | | | | Summary: On Windows, if the frame size exceeds 4096 bytes, the compiler needs to generate a call to _alloca_probe. The X86CallFrameOptimization pass changes the reserved stack size, which causes the stack probe function not to be inserted. This patch fixes the issue by detecting the call frame size; if the size exceeds 4096 bytes, X86CallFrameOptimization is dropped. Reviewers: craig.topper, wxiao3, annita.zhang, rnk, RKSimon Reviewed By: rnk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65923 llvm-svn: 368503
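The decision described in the stack probe fix above reduces to a size comparison, sketched below. The names `PROBE_THRESHOLD`, `needs_stack_probe`, and `may_run_call_frame_optimization` are hypothetical; only the 4096-byte limit comes from the commit message, and the real pass makes further checks that are not modelled here.

```python
PROBE_THRESHOLD = 4096  # bytes, the limit quoted in the commit message


def needs_stack_probe(frame_size):
    """A frame larger than the threshold needs a call to the probe function."""
    return frame_size > PROBE_THRESHOLD


def may_run_call_frame_optimization(call_frame_size):
    """Assumed rule: skip the optimization when the call frame needs a probe."""
    return not needs_stack_probe(call_frame_size)


small_frame_ok = may_run_call_frame_optimization(512)
large_frame_ok = may_run_call_frame_optimization(8192)
```

Under this rule a 512-byte call frame keeps the optimization, while an 8192-byte one drops it so that the probe call is still inserted.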
* [MemDep] allow to select block-scan-limit when constructing ↵Fedor Sergeev2019-08-102-8/+19
| | | | | | | | | | | | | MemoryDependenceAnalysis Introducing non-global control for default block-scan-limit in MemDep analysis. Useful when there are many compilations per initialized LLVM instance (e.g. JIT). Reviewed By: asbirlea Tags: #llvm Differential Revision: https://reviews.llvm.org/D65806 llvm-svn: 368502
* Fix a false positive warning when initializing members with gsl::Owners.Gabor Horvath2019-08-102-0/+20
| | | | llvm-svn: 368501
* [clangd] Disallow extraction of expression-statements.Sam McCall2019-08-095-54/+97
| | | | | | | | | | | | | | | | | | | | | Summary: I split out the "extract parent instead of this" logic from the "this isn't worth extracting" logic (now in eligibleForExtraction()), because I found it hard to reason about. While here, handle overloaded as well as builtin assignment operators. Also this uncovered a bug in getCallExpr() which I fixed. Reviewers: SureYeaah Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D65337 llvm-svn: 368500
* Attempt to reapply "Even more warnings utilizing gsl::Owner/gsl::Pointer ↵Gabor Horvath2019-08-092-10/+79
| | | | | | annotations" llvm-svn: 368499
* clangd: use -j for background index poolSam McCall2019-08-093-9/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: clangd supports a -j option to limit the number of threads to use for parsing TUs. However, when using -background-index (the default in later versions of clangd), the parallelism used by clangd defaults to the hardware_parallelisn, i.e. number of physical cores. On shared hardware environments, with large projects, this can significantly affect performance with no way to tune it down. This change makes the -j parameter apply equally to parsing and background index. It's not perfect, because the total number of threads is 2x the -j value, which may still be unexpected. But at least this change allows users to prevent clangd from using all CPU cores. Reviewers: kadircet, sammccall Reviewed By: sammccall Subscribers: javed.absar, jfb, sammccall, ilya-biryukov, MaskRay, jkorous, arphaman, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D66031 llvm-svn: 368498
* Small format fixHaibo Huang2019-08-091-1/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D66034 llvm-svn: 368497
* Detects whether RESOURCE_TYPE_IO is defined.Haibo Huang2019-08-091-0/+3
| | | | | | | | | | | | | | Summary: This fixes lldb build on macOS SDK prior to 10.12. Reviewers: JDevlieghere Subscribers: lldb-commits Tags: #lldb Differential Revision: https://reviews.llvm.org/D66034 llvm-svn: 368496
* cfi-icall: Allow the jump table to be optionally made non-canonical.Peter Collingbourne2019-08-0922-113/+352
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The default behavior of Clang's indirect function call checker will replace the address of each CFI-checked function in the output file's symbol table with the address of a jump table entry which will pass CFI checks. We refer to this as making the jump table `canonical`. This property allows code that was not compiled with ``-fsanitize=cfi-icall`` to take a CFI-valid address of a function, but it comes with a couple of caveats that are especially relevant for users of cross-DSO CFI: - There is a performance and code size overhead associated with each exported function, because each such function must have an associated jump table entry, which must be emitted even in the common case where the function is never address-taken anywhere in the program, and must be used even for direct calls between DSOs, in addition to the PLT overhead. - There is no good way to take a CFI-valid address of a function written in assembly or a language not supported by Clang. The reason is that the code generator would need to insert a jump table in order to form a CFI-valid address for assembly functions, but there is no way in general for the code generator to determine the language of the function. This may be possible with LTO in the intra-DSO case, but in the cross-DSO case the only information available is the function declaration. One possible solution is to add a C wrapper for each assembly function, but these wrappers can present a significant maintenance burden for heavy users of assembly in addition to adding runtime overhead. For these reasons, we provide the option of making the jump table non-canonical with the flag ``-fno-sanitize-cfi-canonical-jump-tables``. When the jump table is made non-canonical, symbol table entries point directly to the function body. Any instances of a function's address being taken in C will be replaced with a jump table address. 
This scheme does have its own caveats, however. It does end up breaking function address equality more aggressively than the default behavior, especially in cross-DSO mode which normally preserves function address equality entirely. Furthermore, it is occasionally necessary for code not compiled with ``-fsanitize=cfi-icall`` to take a function address that is valid for CFI. For example, this is necessary when a function's address is taken by assembly code and then called by CFI-checking C code. The ``__attribute__((cfi_jump_table_canonical))`` attribute may be used to make the jump table entry of a specific function canonical so that the external code will end up taking an address for the function that will pass CFI checks. Fixes PR41972. Differential Revision: https://reviews.llvm.org/D65629 llvm-svn: 368495