summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Added note to SSE4a builtins about keeping in sync with llvm testsSimon Pilgrim2016-03-101-0/+2
| | | | llvm-svn: 263116
* Updated SSSE3 builtin tests to more closely match the llvm fast-isel ↵Simon Pilgrim2016-03-101-15/+17
| | | | | | equivalent tests llvm-svn: 263115
* [CG] Back out my pointless move ctor and add the explicit templateChandler Carruth2016-03-102-1/+3
| | | | | | instantiation needed for the mingw dll build bot. llvm-svn: 263114
* Minor Wdocumentation fix. NFCI.Simon Pilgrim2016-03-101-1/+0
| | | | llvm-svn: 263113
* [SROA] Clean up some really weird code, no functionality changed.Chandler Carruth2016-03-101-3/+3
| | | | | | | | | | | We already have the instruction extracted into 'I', just cast that to a store the way we do for loads. Also, we don't enter the if unless SI is non-null, so don't test it again for null. I'm pretty sure the entire test there can be nuked, but this is just the trivial cleanup. llvm-svn: 263112
* AVX-512: Fixed a bug in i1 vector zero extending. (Skylake-avx512)Elena Demikhovsky2016-03-102-23/+71
| | | | | | | | (failed on instruction selection phase) Differential Revision: http://reviews.llvm.org/D17924 llvm-svn: 263111
* [CG] Try adding an explicit move constructor to see if that helps theChandler Carruth2016-03-101-0/+1
| | | | | | one build bot that is crashing on this code. llvm-svn: 263110
* Correcting an attribute documentation generation error by giving the abi_tag ↵Aaron Ballman2016-03-101-0/+1
| | | | | | attribute a documentation category. llvm-svn: 263109
* [AMDGPU] Fix SMEM instructions encoding/operand namingsValery Pykhtin2016-03-107-37/+153
| | | | | | Differential Revision: http://reviews.llvm.org/D17651 llvm-svn: 263108
* Revert "Track expression language from one place in ClangExpressionParser"Ewan Crawford2016-03-102-9/+9
| | | | | | r263099 seems to have broken some OSX tests llvm-svn: 263107
* [test/asan/closed-fds] Properly quote log_path for shell invocation.Filipe Cabecinhas2016-03-101-1/+1
| | | | llvm-svn: 263106
* [X86][AVX] Improve target shuffle combining of BLEND+zeroSimon Pilgrim2016-03-102-7/+4
| | | | | | | | The BLEND+zero combine was failing to combine equivalent BLEND masks. Follow up to D17483 and D17858 llvm-svn: 263105
* [CG] Add a new pass manager printer pass for the old call graph andChandler Carruth2016-03-106-1/+28
| | | | | | | | | | | | | | | | | actually finish wiring up the old call graph. There were bugs in the old call graph that hadn't been caught because it wasn't being tested. It wasn't being tested because it wasn't in the pipeline system and we didn't have a printing pass to run in tests. This fixes all of that. As for why I'm still keeping the old call graph alive its so that I can port GlobalsAA to the new pass manager with out forking it to work with the lazy call graph. That's clearly the right eventual design, but it seems pragmatic to defer that until its necessary. The old call graph works just fine for GlobalsAA. llvm-svn: 263104
* [LCG] Spell the printing pass pipeline name for the lazy call graphChandler Carruth2016-03-103-3/+3
| | | | | | | | | | 'lcg' instead of just 'cg'. This makes it consistent with the analysis name of 'lcg'. No functionality changed. llvm-svn: 263103
* [X86][SSE] Basic combining of unary target shuffles of binary target shuffles.Simon Pilgrim2016-03-103-26/+25
| | | | | | | | | | | | This patch reorders the combining of target shuffle masks so that when a unary shuffle takes a binary shuffle as its input but only references one of its inputs it can correctly combine into a unary shuffle mask. This is starting to encroach on the purpose of resolveTargetShuffleInputs, but I don't want to remove it until we definitely know we won't need it for full binary shuffle combining. There is a lot more work before we can properly support binary target shuffle masks but this was an easy case to add support for. Differential Revision: http://reviews.llvm.org/D17858 llvm-svn: 263102
* [CG] Actually hoist up the generic CallGraphPrinter pass from a weirdChandler Carruth2016-03-104-20/+28
| | | | | | | | | | location in the opt tool to live along side the analysis in LLVM's libraries. No functionality changed here, but this will allow me to port the printer to the new pass manager as well. llvm-svn: 263101
* [CG] Rename the DOT printing pass to actually reference "DOT".Chandler Carruth2016-03-105-13/+13
| | | | | | | | | | | | | There is another pass by the generic name 'CallGraphPrinter' which is actually just a call graph printer tucked away inside the opt tool. I'd like to bring it out and make it follow the same patterns as the rest of the CallGraph code, but doing so would end up conflicting with the name of the DOT printing pass. So this makes the DOT printing pass name be more precise. No functionality changed here. llvm-svn: 263100
* Track expression language from one place in ClangExpressionParserEwan Crawford2016-03-102-9/+9
| | | | | | | | | | The current expression language is currently tracked in a few places within the ClangExpressionParser constructor. This patch adds a private lldb::LanguageType attribute to the ClangExpressionParser class and tracks the expression language from that one place. Author: Luke Drummond <luke.drummond@codeplay.com> Differential Revision: http://reviews.llvm.org/D17719 llvm-svn: 263099
* Add doxygen comments to xmmintrin.h's intrinsics.Ekaterina Romanova2016-03-101-0/+943
| | | | | | | | | | Only half of the intrinsics in this file is documented here. The patch for the other half will be sent out later. The doxygen comments are automatically generated based on Sony's intrinsics document. I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream. llvm-svn: 263098
* AVX-512: Fixed a bug in shuffle for v64i8 typeElena Demikhovsky2016-03-102-0/+17
| | | | | | | | Operation SCALAR_TO_VECTOR for v64i8 and v32i16 should be lowered if BW feature is "on". Differential Revision: http://reviews.llvm.org/D17994 llvm-svn: 263097
* [opt] Fix description of the -disable-verify flagVedant Kumar2016-03-101-1/+1
| | | | llvm-svn: 263096
* Add an LLVM_BUILTIN_DEBUGTRAP macro.Mark Lacey2016-03-101-0/+17
| | | | | | | | | | | | | | Summary: This provides a macro that expands to __builtin_debugtrap() for clang, and __debugbreak() for MSVC. It intentionally expands to nothing for compilers that do not support a similar mechanism that halts the debugger without otherwise crashing the process. Differential Revision: http://reviews.llvm.org/D18002 llvm-svn: 263095
* [lto] Initialize asmparsers.Sean Silva2016-03-102-0/+12
| | | | | | | | | | | | | | | | | | Summary: They are needed for inline asm during LTO. In particular we hit the report_fatal_error on llvm/lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp:138 LLVM ERROR: Inline asm not supported by this streamer because we don't have an asm parser for this target Reviewers: ruiu, rafael Subscribers: Bigcheese, llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D18027 llvm-svn: 263094
* ARM: fix arm_neon_intrinsics.c and re-enable.Tim Northover2016-03-101-6882/+6469
| | | | | | | It turns out I'd never actually tested my recent change because it was gated on long-tests. Failure ensued. llvm-svn: 263093
* Add support for a preserve_most calling convention to the AArch64 backend.Roman Levenstein2016-03-106-1/+59
| | | | | | | | | | This change adds a support for a preserve_most calling convention to the AArch64 backend, similar to how it was done for X86-64. There is also a subsequent patch on top of this one to add a tail-calls support for this calling convention. Differential Revision: http://reviews.llvm.org/D18016 llvm-svn: 263092
* Disable failing test and fix RUN line.Richard Trieu2016-03-101-2/+3
| | | | | | | | See https://llvm.org/bugs/show_bug.cgi?id=26894 for details. This change fixes the incorrect flags to Clang and the piping issue. It also disables the FileCheck portion of the test, which is currently failing. llvm-svn: 263091
* [opt] Only create Verifier passes when requestedVedant Kumar2016-03-101-1/+2
| | | | | | | | | | opt adds Verifier passes in AddOptimizationPasses even if -disable-verify is on. Fix it so that the extra verification occurs either when (1) -disable-verifier is off, or (2) -verify-each is on. Thanks to David Jones for pointing out this behavior! llvm-svn: 263090
* [SLP] Add -slp-min-reg-size command line option.Michael Zolotukhin2016-03-101-9/+21
| | | | | | | | | | MinVecRegSize is currently hardcoded to 128; this patch adds a cl::opt to allow changing it. I tried not to change any existing behavior for the default case. Differential revision: http://reviews.llvm.org/D13278 llvm-svn: 263089
* Add an entry in the Release Notes for LLVMContext::discardValueNames()Mehdi Amini2016-03-101-0/+5
| | | | | From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263088
* Fix false positives for for-loop-analysis warningSteven Wu2016-03-102-0/+27
| | | | | | | | | | | | | | | Summary: For PseudoObjectExpr, the DeclMatcher need to search only all the semantics but also need to search pass OpaqueValueExpr for all potential uses for the Decl. Reviewers: thakis, rtrieu, rjmccall, doug.gregor Subscribers: xazax.hun, rjmccall, doug.gregor, cfe-commits Differential Revision: http://reviews.llvm.org/D17627 llvm-svn: 263087
* Add a flag to the LLVMContext to disable name for Value other than GlobalValueMehdi Amini2016-03-1010-0/+87
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is intended to be a performance flag, on the same level as clang cc1 option "--disable-free". LLVM will never initialize it by default, it will be up to the client creating the LLVMContext to request this behavior. Clang will do it by default in Release build (just like --disable-free). "opt" and "llc" can opt-in using -disable-named-value command line option. When performing LTO on llvm-tblgen, the initial merging of IR peaks at 92MB without this patch, and 86MB after this patch,setNameImpl() drops from 6.5MB to 0.5MB. The total link time goes from ~29.5s to ~27.8s. Compared to a compile-time flag (like the IRBuilder one), it performs very close. I profiled on SROA and obtain these results: 420ms with IRBuilder that preserve name 372ms with IRBuilder that strip name 375ms with IRBuilder that preserve name, and a runtime flag to strip Reviewers: chandlerc, dexonsmith, bogner Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D17946 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263086
* [DWARFASTParserClang] Start with member offset of 0 for members of union types.Siva Chandra2016-03-104-1/+70
| | | | | | | | | | | | | | | Summary: GCC does not emit DW_AT_data_member_location for members of a union. Starting with a 0 value for member locations helps is reading union types in such cases. Reviewers: clayborg Subscribers: ldrumm, lldb-commits Differential Revision: http://reviews.llvm.org/D18008 llvm-svn: 263085
* [gvn] Fix more indenting and formatting in regions of code that willChandler Carruth2016-03-101-64/+62
| | | | | | | | | | | need to be changed for porting to the new pass manager. Also sink the comment on the ValueTable class back to that class instead of it dangling on an anonymous namespace. No functionality changed. llvm-svn: 263084
* [gvn] Reformat a chunk of the GVN code that is strangely indented priorChandler Carruth2016-03-101-241/+240
| | | | | | | | to restructuring it for porting to the new pass manager. No functionality changed. llvm-svn: 263083
* [PM] Port memdep to the new pass manager.Chandler Carruth2016-03-1015-140/+177
| | | | | | | | | | | | | | | | | | | | | | | This is a fairly straightforward port to the new pass manager with one exception. It removes a very questionable use of releaseMemory() in the old pass to invalidate its caches between runs on a function. I don't think this is really guaranteed to be safe. I've just used the more direct port to the new PM to address this by nuking the results object each time the pass runs. While this could cause some minor malloc traffic increase, I don't expect the compile time performance hit to be noticable, and it makes the correctness and other aspects of the pass much easier to reason about. In some cases, it may make things faster by making the sets and maps smaller with better locality. Indeed, the measurements collected by Bruno (thanks!!!) show mostly compile time improvements. There is sadly very limited testing at this point as there are only two tests of memdep, and both rely on GVN. I'll be porting GVN next and that will exercise this heavily though. Differential Revision: http://reviews.llvm.org/D17962 llvm-svn: 263082
* EmitCXXStructorCall -> EmitCXXDestructorCall. NFC.Alexey Samsonov2016-03-103-20/+18
| | | | | | | This function is only used in Microsoft ABI and only to emit destructors. Rename/simplify it accordingly. llvm-svn: 263081
* Remove unused function arguments. NFC.Alexey Samsonov2016-03-101-8/+8
| | | | llvm-svn: 263080
* Certain hardware architectures have registers of 256 bits in sizeEnrico Granata2016-03-105-29/+442
| | | | | | This patch extends Scalar such that it can support data living in such registers (e.g. float values living in the XMM registers) llvm-svn: 263079
* Fix SymbolFilePDB for discontiguous functions.Zachary Turner2016-03-101-29/+53
| | | | | | | | | Previously line table parsing code assumed that the only gaps would occur at the end of functions. In practice this isn't true, so this patch makes the line table parsing more robust in the face of functions with non-contiguous byte arrangements. llvm-svn: 263078
* sanitizer: Fix endianness checks for gccAlexey Samsonov2016-03-094-6/+6
| | | | | | | | | | | | | | | | | Summary: __BIG_ENDIAN__ and __LITTLE_ENDIAN__ are not supported by gcc, which eg. for ubsan Value::getFloatValue will silently fall through to the little endian branch, breaking display of float values by ubsan. Use __BYTE_ORDER__ == __ORDER_BIG/LITTLE_ENDIAN__ as the condition instead, which is supported by both clang and gcc. Noticed while porting ubsan to s390x. Patch by Marcin Kościelnicki! Differential Revision: http://reviews.llvm.org/D17660 llvm-svn: 263077
* [Modules] Add stdatomic to the list of builtin headersBen Langmuir2016-03-091-0/+1
| | | | | | | | | | | | | Since it's provided by the compiler. This allows a system module map file to declare a module for it. No test change for cstd.m, since stdatomic.h doesn't function without a relatively complete stdint.h and stddef.h, which tests using this module don't provide. rdar://problem/24931246 llvm-svn: 263076
* [BasicAA/MDA] Sink aliasing rules for malloc and calloc into BasicAAPhilip Reames2016-03-092-16/+17
| | | | | | | | | | MemoryDependenceAnalysis had a hard-coded exception to the general aliasing rules for malloc and calloc. The reasoning that applied there is equally valid in BasicAA and clarifies the remaining logic in MDA. In principal, this can expose slightly more optimization opportunities, but since essentially all of our aliasing aware memory optimization passes go through MDA, this will likely be NFC in practice. Differential Revision: http://reviews.llvm.org/D15912 llvm-svn: 263075
* [CGP] Duplicate addressing computation in cold paths if required to sink ↵Philip Reames2016-03-092-8/+241
| | | | | | | | | | | | | | addressing mode This patch teaches CGP to duplicate addressing mode computations into cold paths (detected via explicit cold attribute on calls) if required to let addressing mode be safely sunk into the basic block containing each load and store. In general, duplicating code into cold blocks may result in code growth, but should not effect performance. In this case, it's better to duplicate some code than to put extra pressure on the register allocator by making it keep the address through the entirely of the fast path. This patch only handles addressing computations, but in principal, we could implement a more general cold cold scheduling heuristic which tries to reduce register pressure in the fast path by duplicating code into the cold path. Getting the profitability of the general case right seemed likely to be challenging, so I stuck to the existing case (addressing computation) we already had. Differential Revision: http://reviews.llvm.org/D17652 llvm-svn: 263074
* Fix the buildPhilip Reames2016-03-091-0/+1
| | | | | | I screwed up rebasing 263072. This change fixes the build and passes all make check. llvm-svn: 263073
* [LICM] Store promotion when memory is thread localPhilip Reames2016-03-093-13/+191
| | | | | | | | | | | | This patch teaches LICM's implementation of store promotion to exploit the fact that the memory location being accessed might be provable thread local. The fact it's thread local weakens the requirements for where we can insert stores since no other thread can observe the write. This allows us perform store promotion even in cases where the store is not guaranteed to execute in the loop. Two key assumption worth drawing out is that this assumes a) no-capture is strong enough to imply no-escape, and b) standard allocation functions like malloc, calloc, and operator new return values which can be assumed not to have previously escaped. In future work, it would be nice to generalize this so that it works without directly seeing the allocation site. I believe that the nocapture return attribute should be suitable for this purpose, but haven't investigated carefully. It's also likely that we could support unescaped allocas with similar reasoning, but since SROA and Mem2Reg should destroy those, they're less interesting than they first might seem. Differential Revision: http://reviews.llvm.org/D16783 llvm-svn: 263072
* Use %t as the output file name to avoid repetition. NFC.Sean Silva2016-03-091-5/+5
| | | | llvm-svn: 263071
* [lto] Add saving the LTO .o file to -save-temps.Sean Silva2016-03-092-3/+15
| | | | | | | | | | | | | | | | Summary: This implements another part of -save-temps. After this, the only remaining part is dumping the optimized bitcode. But currently LLD's LTO doesn't have a non-intrusive place to put this. Eventually we probably will and it will make sense to add it then. Reviewers: ruiu, rafael Subscribers: joker.eph, Bigcheese, llvm-commits Differential Revision: http://reviews.llvm.org/D18009 llvm-svn: 263070
* [x86] fix cost model inaccuracy for vector memory opsSanjay Patel2016-03-092-6/+6
| | | | | | | | | | | The irony of this patch is that one CPU that is affected is AMD Jaguar, and Jaguar has a completely double-pumped AVX implementation. But getting the cost model to reflect that is a much bigger problem. The small goal here is simply to improve on the lie that !AVX2 == SandyBridge. Differential Revision: http://reviews.llvm.org/D18000 llvm-svn: 263069
* [WebAssembly] Update known gcc test failuresDerek Schuff2016-03-091-3/+0
| | | | llvm-svn: 263068
* [x86, AVX] optimize masked loads with constant masksSanjay Patel2016-03-092-25/+121
| | | | | | | | Instead of a variable-blend instruction, form a blend with immediate because those are always cheaper. Differential Revision: http://reviews.llvm.org/D17899 llvm-svn: 263067
OpenPOWER on IntegriCloud