Commit message | Author | Age | Files | Lines
* Legalize metadata in legacy testcasesAdrian Prantl2016-12-211-8/+7
| | | | llvm-svn: 290287
* Legalize metadata in legacy testcasesAdrian Prantl2016-12-211-1/+5
| | | | llvm-svn: 290286
* Legalize metadata in legacy testcasesAdrian Prantl2016-12-212-3/+8
| | | | llvm-svn: 290285
* [GlobalISel] Add basic Selector-emitter tblgen backend.Ahmed Bougacha2016-12-2110-7/+446
| | | | | | | | | | | | | | | | | This adds a basic tablegen backend that analyzes the SelectionDAG patterns to find simple ones that are eligible for GlobalISel-emission. That's similar to FastISel, with one notable difference: we're not fed ISD opcodes, so we need to map the SDNode operators to generic opcodes. That's done using GINodeEquiv in TargetGlobalISel.td. Otherwise, this is mostly boilerplate, and lots of filtering of any kind of "complicated" pattern. On AArch64, this is sufficient to match G_ADD up to s64 (to ADDWrr/ADDXrr) and G_BR (to B). Differential Revision: https://reviews.llvm.org/D26878 llvm-svn: 290284
* [AsmWriter] Remove redundant cast<>s. NFC.Ahmed Bougacha2016-12-211-2/+2
| | | | llvm-svn: 290283
* specify -DNDEBUG for BNI builds of all targets in the Xcode buildSean Callanan2016-12-211-0/+1
| | | | llvm-svn: 290282
* [WebAssembly] Fix the opcode value for i64.rotr.Dan Gohman2016-12-211-1/+1
| | | | llvm-svn: 290281
* IR: Function summary representation for type tests.Peter Collingbourne2016-12-218-10/+102
| | | | | | | | | | | Each function summary has an attached list of type identifier GUIDs. The idea is that during the regular LTO phase we would match these GUIDs to type identifiers defined by the regular LTO module and store the resolutions in a top-level "type identifier summary" (which will be implemented separately). Differential Revision: https://reviews.llvm.org/D27967 llvm-svn: 290280
* Increase the threshold in unit test to accommodate the quarantine size increase.Evgeniy Stepanov2016-12-211-1/+2
| | | | | | | | | | | | Reviewers: eugenis Patch by Alex Shlyapnikov. Subscribers: llvm-commits, kubabrecka Differential Revision: https://reviews.llvm.org/D28029 llvm-svn: 290279
* [sancov] skip duplicated pointsMike Aizatsky2016-12-211-0/+5
| | | | llvm-svn: 290278
* [sancov] hash prefix results in huge merge files, use shorter prefixMike Aizatsky2016-12-212-21/+20
| | | | llvm-svn: 290277
* Perform type-checking for a converted constant expression in a templateRichard Smith2016-12-215-11/+39
| | | | | | | | | | | argument even if the expression is value-dependent (we need to suppress the final portion of the narrowing check, but the rest of the checking can still be done eagerly). This affects template template argument validity and partial ordering under p0522r0. llvm-svn: 290276
* [AArch64] Remove a redundant check. NFC.Haicheng Wu2016-12-211-2/+1
| | | | | | | | The case AM.Scale == 0 is already handled by the code right above. Differential Revision: https://reviews.llvm.org/D28003 llvm-svn: 290275
* Add the ability for DWARFDie objects to get the parent DWARFDie.Greg Clayton2016-12-217-82/+209
| | | | | | | | | | | | In order for the llvm DWARF parser to be used in LLDB we will need to be able to get the parent of a DIE. This patch adds that functionality by changing the DWARFDebugInfoEntry class to store a depth field instead of a sibling index. Using a depth field allows us to easily calculate the sibling and the parent without increasing the size of DWARFDebugInfoEntry. I tested llvm-dsymutil on a debug version of clang where this fully parses DWARF in over 1200 .o files to verify there was no serious regression in performance. Added a full suite of unit tests to test this functionality. Differential Revision: https://reviews.llvm.org/D27995 llvm-svn: 290274
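A minimal sketch of why a depth field is enough to recover both the parent and the next sibling, assuming the DIEs are kept in a flat list in pre-order. The function names and the plain list of depths are hypothetical and are not the DWARFDebugInfoEntry API; the null entries that terminate sibling chains in real DWARF are left out for brevity.

```python
def parent_index(depths, i):
    """Return the index of the parent of entry i, or None for a root.

    Walking backwards, the first entry that is exactly one level
    shallower is the parent, because the list is in pre-order.
    """
    target = depths[i] - 1
    for j in range(i - 1, -1, -1):
        if depths[j] == target:
            return j
    return None


def sibling_index(depths, i):
    """Return the index of the next sibling of entry i, or None.

    Walking forwards, deeper entries are descendants and are skipped.
    An entry at the same depth is the sibling; a shallower entry means
    the parent's child list has ended.
    """
    for j in range(i + 1, len(depths)):
        if depths[j] == depths[i]:
            return j
        if depths[j] < depths[i]:
            return None
    return None
```

Both lookups are linear scans in this sketch; the point is only that no extra per-entry field beyond the depth is needed.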
* [CMake] Support distribution install for LLDB.frameworkChris Bieneman2016-12-211-1/+11
| | | | | | | | This patch adds the last bit of support to get LLVM_DISTRIBUTION_COMPONENTS working with libLLDB when built as a framework. This patch adds dummy install targets for binaries built into the framework's Resources directory, and makes the framework's install target depend on all the binaries that get installed with the framework. llvm-svn: 290273
* Fix for the __kmpc_global_num_threads function to return the value of the __kmp_all_nth global var.Andrey Churbanov2016-12-211-2/+2
| | | | | | | | Patch by Yonghong Yan. Differential Revision: https://reviews.llvm.org/D27975 llvm-svn: 290272
* cmake: Don't build llvm-config and tblgen concurrently in cross buildsJustin Bogner2016-12-211-1/+2
| | | | | | | | | | | | | This sets USES_TERMINAL for the native llvm-config build, so that it doesn't run at the same time as builds of other native tools (namely, tablegen). Without this, if you're very unlucky with the timing it's possible to be relinking libSupport as one of the tools is linking, causing a spurious failure. The tablegen build adopted USES_TERMINAL for this same reason in r280748. llvm-svn: 290271
* Update mailing list post URL and add libunwind referenceEd Maste2016-12-211-1/+2
| | | | | | | | | | | | | RTDyldMemoryManager.cpp describes the differing __register_frame API between libunwind and libgcc, with a mailing list posting URL. The original link was 404; replace it with what I believe is the intended post, as well as a reference to the "OS X" implementation in libunwind. Differential Revision: https://reviews.llvm.org/D27965 llvm-svn: 290269
* ARM: define a macro for the FPv5 FPU in ARM mode.Tim Northover2016-12-212-0/+3
| | | | | | | FPv5 is in Cortex-M7 and the 64-bit CPUs when running in 32-bit mode. The name is from the Cortex-M7 TRM. llvm-svn: 290268
* [X86][SSE] Improve lowering of vXi64 multiplies Simon Pilgrim2016-12-219-482/+422
| As mentioned on PR30845, we were performing our vXi64 multiplication as:
|
|   AloBlo = pmuludq(a, b);
|   AloBhi = pmuludq(a, psrlqi(b, 32));
|   AhiBlo = pmuludq(psrlqi(a, 32), b);
|   return AloBlo + psllqi(AloBhi, 32) + psllqi(AhiBlo, 32);
|
| when we could avoid one of the upper shifts with:
|
|   AloBlo = pmuludq(a, b);
|   AloBhi = pmuludq(a, psrlqi(b, 32));
|   AhiBlo = pmuludq(psrlqi(a, 32), b);
|   return AloBlo + psllqi(AloBhi + AhiBlo, 32);
|
| This matches the lowering on gcc/icc.
|
| Differential Revision: https://reviews.llvm.org/D27756
| llvm-svn: 290267
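The equivalence of the two lowerings can be checked with a small model of a single 64-bit lane. This is only a sketch: the Python helpers below borrow the names pmuludq, psrlqi and psllqi from the commit message but are plain integer emulations written for illustration, not the X86 lowering code. The two forms agree because a left shift by 32 is a multiplication by 2^32 modulo 2^64, which distributes over the sum.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def pmuludq(a, b):
    """Multiply the low 32 bits of each operand into a 64-bit product."""
    return ((a & MASK32) * (b & MASK32)) & MASK64


def psrlqi(a, n):
    """Logical right shift of a 64-bit lane."""
    return (a & MASK64) >> n


def psllqi(a, n):
    """Left shift of a 64-bit lane, discarding bits shifted out."""
    return (a << n) & MASK64


def mul_two_shifts(a, b):
    """Earlier form: shift each cross product, then add."""
    alo_blo = pmuludq(a, b)
    alo_bhi = pmuludq(a, psrlqi(b, 32))
    ahi_blo = pmuludq(psrlqi(a, 32), b)
    return (alo_blo + psllqi(alo_bhi, 32) + psllqi(ahi_blo, 32)) & MASK64


def mul_one_shift(a, b):
    """Later form: add the cross products, then shift once."""
    alo_blo = pmuludq(a, b)
    alo_bhi = pmuludq(a, psrlqi(b, 32))
    ahi_blo = pmuludq(psrlqi(a, 32), b)
    return (alo_blo + psllqi((alo_bhi + ahi_blo) & MASK64, 32)) & MASK64
```

Both functions return the low 64 bits of a * b; the AhiBhi term is never computed because it only contributes to bits 64 and above.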
* Revert "[InstCombine] New opportunities for FoldAndOfICmp and FoldXorOfICmp"David Majnemer2016-12-213-302/+2
| | | | | | This reverts commit r289813, it caused PR31449. llvm-svn: 290266
* AMDGPU/SI: Fix file headerTom Stellard2016-12-211-1/+1
| | | | llvm-svn: 290265
* TypeMetadataUtils: Simplify; spotted by Mehdi.Peter Collingbourne2016-12-211-2/+1
| | | | llvm-svn: 290264
* Add missing includes on Windows.Zachary Turner2016-12-212-0/+4
| | | | | | | Patch by Andrey Khalyavin Differential Revision: https://reviews.llvm.org/D27915 llvm-svn: 290263
* Make some diagnostic tests C++11 clean.Paul Robinson2016-12-213-7/+38
| | | | | | Differential Revision: http://reviews.llvm.org/D27794 llvm-svn: 290262
* [LLParser] Parse vector GEP constant expression correctlyMichael Kuperstein2016-12-213-4/+24
| | | | | | | | | | | The constantexpr parsing was too constrained and rejected legal vector GEPs. This relaxes it to be similar to the ones for instruction parsing. This fixes PR30816. Differential Revision: https://reviews.llvm.org/D28013 llvm-svn: 290261
* [ConstantFolding] Fix vector GEPs harderMichael Kuperstein2016-12-212-3/+27
| | | | | | | | | | For vector GEPs, CastGEPIndices can end up in an infinite recursion, because we compare the vector type to the scalar pointer type, find them different, and then try to cast a type to itself. Differential Revision: https://reviews.llvm.org/D28009 llvm-svn: 290260
* clang-format: Fix bug in handling of single-column lists.Daniel Jasper2016-12-212-8/+14
| Members that are themselves wrapped in fake parentheses would lead to AvoidBinPacking being set on the wrong ParenState.
|
| After:
|   vector<int> aaaa = {
|       aaaaaa.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa,
|       aaaaaa.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa,
|       aaaaaa.aaaaaaa,
|       aaaaaa.aaaaaaa,
|       aaaaaa.aaaaaaa,
|       aaaaaa.aaaaaaa,
|   };
|
| Before we were falling back to bin-packing these.
|
| llvm-svn: 290259
* Wdocumentation fixSimon Pilgrim2016-12-211-44/+44
| | | | llvm-svn: 290258
* [CostModel] Pass shuffle mask args with ArrayRef. NFCI.Simon Pilgrim2016-12-211-2/+2
| | | | llvm-svn: 290257
* Change the determination of parameters of macro-kernelRoman Gareev2016-12-214-102/+109
| Typically processor architectures do not include an L3 cache, which means that Nc, the parameter of the macro-kernel, is, for all practical purposes, redundant ([1]). However, small values of Nc can cause redundant packing of the same elements of the matrix A, the first operand of the matrix multiplication. At the same time, big values of Nc can cause segmentation faults if the available stack is exceeded.
|
| This patch adds an option to specify the parameter Nc as a multiple of Nr, the parameter of the micro-kernel.
|
| In case of Intel Core i7-3820 SandyBridge and the following options,
|
|   clang -O3 gemm.c -I utilities/ utilities/polybench.c -DPOLYBENCH_TIME -march=native -mllvm -polly -mllvm -polly-pattern-matching-based-opts=true -DPOLYBENCH_USE_SCALAR_LB -mllvm -polly-target-cache-level-associativity=8,8 -mllvm -polly-target-cache-level-sizes=32768,262144 -mllvm -polly-target-latency-vector-fma=8
|
| it helps to improve the performance from 11.303 GFlops/sec (39.247% of theoretical peak) to 17.896 GFlops/sec (62.14% of theoretical peak).
|
| Refs.:
| [1] - http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf
|
| Reviewed-by: Tobias Grosser <tobias@grosser.es>
| Differential Revision: https://reviews.llvm.org/D28019
| llvm-svn: 290256
* revert first commit . removing empty line in X86.hMichael Zuckerman2016-12-211-1/+0
| | | | llvm-svn: 290255
* First commit adding new line to X86.hMichael Zuckerman2016-12-211-0/+1
| | | | llvm-svn: 290254
* Align newly created arrays to the first level cache line boundaryRoman Gareev2016-12-211-2/+7
| Aligning data to cache line boundaries helps to avoid overheads related to accessing it ([1]). This patch aligns newly created arrays and adds an option to specify the first level cache line size. By default we use 64 bytes, which is a typical cache-line size ([2]).
|
| In case of Intel Core i7-3820 SandyBridge and the following options,
|
|   clang -O3 gemm.c -I utilities/ utilities/polybench.c -DPOLYBENCH_TIME -march=native -mllvm -polly -mllvm -polly-pattern-matching-based-opts=true -DPOLYBENCH_USE_SCALAR_LB -mllvm -polly-target-cache-level-associativity=8,8 -mllvm -polly-target-cache-level-sizes=32768,262144 -mllvm -polly-target-latency-vector-fma=8
|
| it helps to improve the performance from 11.303 GFlops/sec (39.247% of theoretical peak) to 12.63 GFlops/sec (43.8542% of theoretical peak).
|
| Refs.:
| [1] - http://www.alexonlinux.com/aligned-vs-unaligned-memory-access
| [2] - http://igoro.com/archive/gallery-of-processor-cache-effects/
|
| Differential Revision: https://reviews.llvm.org/D28020
| Reviewed-by: Tobias Grosser <tobias@grosser.es>
| llvm-svn: 290253
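To illustrate why alignment to a 64-byte line matters, here is a small sketch of the arithmetic involved. The helper names are hypothetical and nothing here is taken from Polly itself; it only shows how an address is rounded up to a line boundary and how many cache lines a buffer touches before and after alignment.

```python
def align_up(address, alignment=64):
    """Round address up to the next multiple of a power-of-two alignment."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (address + alignment - 1) & ~(alignment - 1)


def cache_lines_touched(address, size, line_size=64):
    """Count the cache lines covered by the byte range [address, address + size)."""
    first_line = address // line_size
    last_line = (address + size - 1) // line_size
    return last_line - first_line + 1
```

For example, a 64-byte object at address 100 straddles two lines, while the same object placed at align_up(100), which is 128, fits in one.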
* [ELF/tests] Use cpio -it instead of cpio -t.Davide Italiano2016-12-213-3/+3
| | | | | | | | | | | | | OpenBSD's cpio does not accept the -t option without -i. Apparently some systems implement cpio -t as a shortcut for cpio -it, the latter is the only thing that's documented. This change avoids test failures on OpenBSD. Patch by Mark Kettenis! Differential Revision: https://reviews.llvm.org/D28002 llvm-svn: 290252
* [Polly] Use three-dimensional arrays to store packed operands of the matrixRoman Gareev2016-12-212-31/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | multiplication Previously we had two-dimensional accesses to store packed operands of the matrix multiplication for the sake of simplicity of the packed arrays. However, addition of the third dimension helps to simplify the corresponding memory access, reduce the execution time of isl operations applied to it, and consequently reduce the compile-time of Polly. For example, in case of Intel Core i7-3820 SandyBridge and the following options, clang -O3 gemm.c -I utilities/ utilities/polybench.c -DPOLYBENCH_TIME -march=native -mllvm -polly -mllvm -polly-pattern-matching-based-opts=true -DPOLYBENCH_USE_SCALAR_LB -mllvm -polly-target-cache-level-associativity=8,8 -mllvm -polly-target-cache-level-sizes=32768,262144 -mllvm -polly-target-latency-vector-fma=7 it helps to reduce the compile-time from about 361.456 seconds to about 0.816 seconds. Reviewed-by: Michael Kruse <llvm@meinersbur.de>, Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D27878 llvm-svn: 290251
* Added a template for building target specific memory node in DAG.Elena Demikhovsky2016-12-2110-161/+445
| | | | | | | | I added an API for creating a target-specific memory node in the DAG. Today, all memory nodes are common to all targets and their constructors are located in SelectionDAG.cpp. There are some cases in X86 where we need to create a special node: truncation-with-saturation store, float-to-half store. In the current patch I added truncation-with-saturation nodes and I'm using them for intrinsics. In the future I plan to implement DAG lowering for the truncation-with-saturation pattern. Differential Revision: https://reviews.llvm.org/D27899 llvm-svn: 290250
* [AMDGPU] Garbage collect dead code. NFCI.Davide Italiano2016-12-211-15/+0
| | | | llvm-svn: 290249
* [X86] Vectorcall Calling Convention - Adding CodeGen Complete SupportOren Ben Simhon2016-12-211-4/+4
| | | | | | Fixing a warning. llvm-svn: 290248
* [ELF] - Linkerscript: Fall back to search paths when INCLUDE not foundGeorge Rimar2016-12-212-2/+16
| | | | | | | | | | | | From https://sourceware.org/binutils/docs/ld/File-Commands.html: The file will be searched for in the current directory, and in any directory specified with the -L option. Patch done by Alexander Richardson. Differential revision: https://reviews.llvm.org/D27831 llvm-svn: 290247
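The lookup order quoted from the ld documentation can be sketched as follows. This is an illustrative model only: find_include and its is_file parameter are hypothetical names, not lld functions, and the sketch assumes the search paths are given in the order of the -L options.

```python
import os


def find_include(name, search_paths, cwd=".", is_file=os.path.isfile):
    """Resolve an INCLUDE file name.

    The current directory is tried first; if the file is not there,
    each -L style search path is tried in order. Returns the first
    existing path, or None if the file is not found anywhere.
    """
    candidate = os.path.join(cwd, name)
    if is_file(candidate):
        return candidate
    for directory in search_paths:
        candidate = os.path.join(directory, name)
        if is_file(candidate):
            return candidate
    return None
```

The is_file argument exists only so the lookup can be exercised without touching the file system.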
* [X86] Vectorcall Calling Convention - Adding CodeGen Complete SupportOren Ben Simhon2016-12-211-1/+1
| | | | | | Fixing failing test. llvm-svn: 290246
* Reverting last change.Oren Ben Simhon2016-12-211-1/+1
| | | | llvm-svn: 290245
* [X86] Vectorcall Calling Convention - Adding CodeGen Complete SupportOren Ben Simhon2016-12-211-4/+4
| | | | | | Fixing build issues. llvm-svn: 290244
* [ELF] - Removed trailing whitespaces. NFC.George Rimar2016-12-211-1/+0
| | | | llvm-svn: 290243
* [X86] Vectorcall Calling Convention - Adding CodeGen Complete SupportOren Ben Simhon2016-12-211-1/+1
| | | | | | Fixing build issues. llvm-svn: 290242
* De-template DefinedSynthetic.Rui Ueyama2016-12-216-36/+20
| | | | | | | | DefinedSynthetic is not created for a real ELF object, so it doesn't have to be a template function. It has a virtual st_value, which is either 32 bit or 64 bit, but we can simply use 64 bit. llvm-svn: 290241
* [X86] Vectorcall Calling Convention - Adding CodeGen Complete SupportOren Ben Simhon2016-12-219-72/+471
| | | | | | | | | | | The vectorcall calling convention specifies that arguments to functions are to be passed in registers, when possible. vectorcall uses more registers for arguments than fastcall or the default x64 calling convention use. The vectorcall calling convention is only supported in native code on x86 and x64 processors that include Streaming SIMD Extensions 2 (SSE2) and above. The current implementation does not handle Homogeneous Vector Aggregates (HVAs) correctly and this review attempts to fix it. This submit also includes additional lit tests to better cover HVA corner cases. Differential Revision: https://reviews.llvm.org/D27392 llvm-svn: 290240
* [ELF] - Do not call fatal() in Target.cpp, call error() instead.George Rimar2016-12-213-7/+38
| We would probably want to avoid fatal() where we can in the context of librarification, but for me the reason for this patch is to help D27900 go in.
|
| D27900 changes error reporting to something like
|
|   error: text1
|   note: text2
|   note: text3
|
| where the hint is used to provide additional information about the location.
|
| In that case I can't just call fatal(), because the user would not see the notes after it, which adds a complication to handle. So it is good to switch fatal() to error() where possible.
|
| This also adds a testcase with a broken relocation number; previously we did not have any. It checks that error() works fine in place of fatal().
|
| Differential revision: https://reviews.llvm.org/D27973
| llvm-svn: 290239
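The reason fatal() gets in the way of multi-line diagnostics can be shown with a toy reporter. Everything below is hypothetical (the Diagnostics class and report_bad_relocation are not lld code); it assumes only what the message states, namely that fatal() stops immediately while error() records the problem and lets the caller continue.

```python
class Diagnostics:
    """Toy diagnostic sink that records messages in emission order."""

    def __init__(self):
        self.messages = []
        self.error_count = 0

    def error(self, text):
        # Record the error and return, so the caller can add notes.
        self.messages.append("error: " + text)
        self.error_count += 1

    def note(self, text):
        self.messages.append("note: " + text)

    def fatal(self, text):
        # Record the error and stop at once; nothing after this runs.
        self.messages.append("error: " + text)
        raise SystemExit(1)


def report_bad_relocation(diag, use_fatal):
    """Emit one error followed by two notes, as in the example output."""
    report = diag.fatal if use_fatal else diag.error
    report("text1")
    diag.note("text2")
    diag.note("text3")
```

With use_fatal=False all three lines are recorded; with use_fatal=True the process would exit after the first line and the notes would never be emitted.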
* [ELF] - Fix use of freed memory.George Rimar2016-12-211-3/+1
| It was revealed by D27831.
|
| If we have a linker script that includes another one that sets OUTPUT, for example:
|
|   RUN: echo "INCLUDE \"foo.script\"" > %t.script
|   RUN: echo "OUTPUT(\"%t.out\")" > %T/foo.script
|
| then we do:
|
|   void ScriptParser::readInclude() {
|     ...
|     std::unique_ptr<MemoryBuffer> &MB = *MBOrErr;
|     tokenize(MB->getMemBufferRef());
|     OwningMBs.push_back(std::move(MB));
|   }
|
|   void ScriptParser::readOutput() {
|     ...
|     Config->OutputFile = unquote(Tok);
|     ...
|   }
|
| The problem is that OwningMBs are destroyed after the script parser does its job, so all Toks are dead and Config->OutputFile points to destroyed data.
|
| The patch suggests saving all included scripts using the string Saver.
|
| Differential revision: https://reviews.llvm.org/D27987
| llvm-svn: 290238
* [ELF][MIPS] Allow .MIPS.abiflags larger than one Elf_Mips_ABIFlags structSimon Atanasyan2016-12-213-2/+70
| | | | | | | | | | | Older versions of BFD generate libraries with .MIPS.abiflags that only concatenate the individual .MIPS.abiflags sections instead of merging. Patch by Alexander Richardson. Differential revision: https://reviews.llvm.org/D27770 llvm-svn: 290237
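A rough sketch of what "larger than one struct" means in practice: a section that is a concatenation of several 24-byte Elf_Mips_ABIFlags records can be split back into individual records before any merging is attempted. The field order follows the Elf_Mips_ABIFlags layout; little-endian encoding, the helper name split_abiflags, and the choice to reject empty or ragged sections are assumptions made for this example and do not describe what lld does with the records afterwards.

```python
import struct

# version (u16), then isa_level, isa_rev, gpr_size, cpr1_size, cpr2_size,
# fp_abi (u8 each), then isa_ext, ases, flags1, flags2 (u32 each).
ABIFLAGS_FORMAT = "<HBBBBBBIIII"  # little-endian is an assumption here
ABIFLAGS_SIZE = struct.calcsize(ABIFLAGS_FORMAT)

FIELDS = (
    "version", "isa_level", "isa_rev", "gpr_size", "cpr1_size",
    "cpr2_size", "fp_abi", "isa_ext", "ases", "flags1", "flags2",
)


def split_abiflags(section):
    """Split raw .MIPS.abiflags bytes into a list of per-record dicts."""
    if len(section) == 0 or len(section) % ABIFLAGS_SIZE != 0:
        raise ValueError("section size is not a multiple of the record size")
    return [
        dict(zip(FIELDS, record))
        for record in struct.iter_unpack(ABIFLAGS_FORMAT, section)
    ]
```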