Commit message | Author | Age | Files | Lines
* [X86] Combine some of the strings in autoupgrade code. | Craig Topper | 2016-09-03 | 1 | -35/+7
  llvm-svn: 280603
* Cleanup: Use metadata preserving API for branch creation | Xinliang David Li | 2016-09-03 | 1 | -9/+4
  Use the wrapper API in IRBuilder that copies metadata to create the new branch in LoopUnswitch.

  llvm-svn: 280602
* ScopInfo: Do not derive assumptions from all GEP pointer instructions | Tobias Grosser | 2016-09-03 | 2 | -107/+0
  ... but instead rely on the assumptions that we derive for load/store instructions.

  Before we were able to delinearize arrays, we used GEP pointer instructions to derive information about the likely range of induction variables, which gave us more freedom during loop scheduling. Today, this is not needed any more as we delinearize multi-dimensional memory accesses and as part of this process also "assume" that all accesses to these arrays remain inbounds. The old derive-assumptions-from-GEP code has consequently become mostly redundant.

  We drop it both to clean up our code and to improve compile time. This change reduces the scop construction time for 3mm in no-asserts mode on my machine from 48 to 37 ms.

  llvm-svn: 280601
* [Profile] preserve branch metadata lowering select in CGP | Xinliang David Li | 2016-09-03 | 4 | -8/+42
  CGP currently drops select's MD_prof profile data when generating a conditional branch, which can lead to bad code layout. The patch fixes the issue.

  Differential Revision: http://reviews.llvm.org/D24169

  llvm-svn: 280600
* Fix ThinLTO crash with debug info | Mehdi Amini | 2016-09-03 | 4 | -0/+87
  Because of the recent change to ODR type uniquing in the context, we can reach types defined in another module during IR linking. This triggered some assertions when we IR link without starting from an empty module.

  To alleviate that, we can self-map metadata defined in the destination module so that it won't be visited.

  Differential Revision: https://reviews.llvm.org/D23841

  llvm-svn: 280599
* Strip trailing whitespace | Simon Pilgrim | 2016-09-03 | 1 | -2/+2
  llvm-svn: 280598
* [AVX-512] Remove masked integer mullo builtins and replace with native IR. | Craig Topper | 2016-09-03 | 13 | -126/+108
  llvm-svn: 280597
* [AVX-512] Remove masked integer add/sub builtins and replace with native IR. | Craig Topper | 2016-09-03 | 9 | -341/+290
  llvm-svn: 280596
* AMDGPU: Set sizes of spill pseudos | Matt Arsenault | 2016-09-03 | 3 | -3/+13
  llvm-svn: 280595
* AMDGPU: Fix adding duplicate implicit exec uses | Matt Arsenault | 2016-09-03 | 1 | -1/+15
  I'm not sure if this should be considered a bug in copyImplicitOps or not, but implicit operands that are part of the static instruction definition should not be copied.

  llvm-svn: 280594
* [AVX-512] Add integer ADD/SUB instructions to load folding tables. Add an AVX512 stack folding test. | Craig Topper | 2016-09-03 | 3 | -0/+498
  llvm-svn: 280593
* [AVX-512] Mark EVEX encoded vpcmpeq as commutable just like its AVX and SSE equivalent. | Craig Topper | 2016-09-03 | 1 | -21/+23
  llvm-svn: 280592
* Fix the attribute documentation build. | Aaron Ballman | 2016-09-03 | 1 | -0/+1
  llvm-svn: 280591
* AMDGPU: Reduce the duration of whole-quad-mode | Nicolai Haehnle | 2016-09-03 | 2 | -82/+145
  Summary:
  This contains two changes that reduce the time spent in WQM, with the intention of reducing bandwidth required by VMEM loads:

  1. Sampling instructions by themselves don't need to run in WQM, only their coordinate inputs need it (unless of course there is a dependent sampling instruction). The initial scanInstructions step is modified accordingly.

  2. When switching back from WQM to Exact, switch back as soon as possible. This affects the logic in processBlock.

  This should always be a win or, at worst, neutral.

  There are also some cleanups (e.g. remove unused ExecExports) and some new debugging output.

  Reviewers: arsenm, tstellarAMD, mareko

  Subscribers: arsenm, llvm-commits, kzhuravl

  Differential Revision: http://reviews.llvm.org/D22092

  llvm-svn: 280590
* AMDGPU: Fix an interaction between WQM and polygon stippling | Nicolai Haehnle | 2016-09-03 | 3 | -11/+46
  Summary:
  This fixes a rare bug in polygon stippling with non-monolithic pixel shaders.

  The underlying problem is as follows: the prolog part contains the polygon stippling sequence, i.e. a kill. The main part then enables WQM based on the _reduced_ exec mask, effectively undoing most of the polygon stippling.

  Since we cannot know whether polygon stippling will be used, the main part of a non-monolithic shader must always return to exact mode to fix this problem.

  Reviewers: arsenm, tstellarAMD, mareko

  Subscribers: arsenm, llvm-commits, kzhuravl

  Differential Revision: https://reviews.llvm.org/D23131

  llvm-svn: 280589
* Fix PR30202 - notify_all_at_thread_exit seg faults if run from a raw pthread context. | Eric Fiselier | 2016-09-03 | 3 | -16/+88
  Summary:
  This patch allows threads not created using `std::thread` to use `std::notify_all_at_thread_exit` by ensuring the TL state has been initialized within `std::notify_all_at_thread_exit`.

  Additionally this patch "fixes" a potential oddity in `__thread_local_pointer::reset(pointer)`, which would previously delete the old thread local data. However there should *never* be old thread local data because pthread *should* null it out on thread exit. Unfortunately it's possible that pthread failed to do this according to the spec:

  > Upon key creation, the value NULL shall be associated with the new key in all active threads. Upon thread creation, the value NULL shall be associated with all defined keys in the new thread.
  >
  > An optional destructor function may be associated with each key value. At thread exit, if a key value has a non-NULL destructor pointer, and the thread has a non-NULL value associated with that key, the value of the key is set to NULL, and then the function pointed to is called with the previously associated value as its sole argument. The order of destructor calls is unspecified if more than one destructor exists for a thread when it exits.
  >
  > If, after all the destructors have been called for all non-NULL values with associated destructors, there are still some non-NULL values with associated destructors, then the process is repeated. If, after at least {PTHREAD_DESTRUCTOR_ITERATIONS} iterations of destructor calls for outstanding non-NULL values, there are still some non-NULL values with associated destructors, implementations may stop calling destructors, or they may continue calling destructors until no non-NULL values with associated destructors exist, even though this might result in an infinite loop.

  However if pthread fails to delete the value it is probably incorrect for us to do it. Destroying the value performs all of the "at thread exit" actions registered with it but we are way past "at thread exit".

  Reviewers: mclow.lists, bcraig, EricWF

  Subscribers: cfe-commits

  Differential Revision: https://reviews.llvm.org/D24159

  llvm-svn: 280588
* Replace the Radeon GCN GPU family names by more descriptive ones | Niels Ole Salscheider | 2016-09-03 | 1 | -25/+25
  Differential Revision: https://reviews.llvm.org/D23957

  llvm-svn: 280587
* AMDGPU: Do basic folding of class intrinsic | Matt Arsenault | 2016-09-03 | 2 | -0/+316
  This allows more of the OCML builtin library to be constant folded.

  llvm-svn: 280586
* memory_resource still needs init_priority when built with GCC 4.9 | Eric Fiselier | 2016-09-03 | 1 | -3/+1
  llvm-svn: 280585
* AMDGPU: Fix spilling of m0 | Matt Arsenault | 2016-09-03 | 5 | -55/+122
  readlane/writelane do not support using m0 as the output/input. Constrain the register class of spill vregs to try to avoid this, but also handle spilling of the physreg when necessary by inserting an additional copy to a normal SGPR.

  llvm-svn: 280584
* Improve debug error message with register name | Matt Arsenault | 2016-09-03 | 1 | -1/+2
  llvm-svn: 280583
* [AVX-512] Add EVEX encoded VPCMPEQ and VPCMPGT to the load folding tables. | Craig Topper | 2016-09-03 | 2 | -0/+72
  llvm-svn: 280581
* Add a test Aaron asked for that I forgot to add before landing r280578. | Nico Weber | 2016-09-03 | 1 | -0/+2
  llvm-svn: 280580
* Make lit/util.py py3-compatible. | NAKAMURA Takumi | 2016-09-03 | 1 | -1/+1
  llvm-svn: 280579
* [ms] Add support for parsing uuid as a Microsoft attribute. | Nico Weber | 2016-09-03 | 7 | -2/+261
  Some Windows SDK classes, for example Windows::Storage::Streams::IBufferByteAccess, use the ATL way of spelling attributes:

    [uuid("....")] class IBufferByteAccess {};

  To be able to use __uuidof() to grab the uuid off these types, clang needs to support uuid as a Microsoft attribute. There was already code to skip Microsoft attributes; extend that to look for uuid and parse it. Use the new "Microsoft" attribute type added in r280575 (and r280574, r280576) for this.

  Final part of https://reviews.llvm.org/D23895

  llvm-svn: 280578
* Revert r280549. | Nico Weber | 2016-09-03 | 3 | -536/+483
  The test it added doesn't pass:
  http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/15318/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Apdbdump-yaml-types.test

  Command Output (stdout):
  --
  $ "D:/buildslave/clang-x64-ninja-win7/stage1/./bin\llvm-pdbdump.EXE" "pdb2yaml" "-tpi-stream" "D:\buildslave\clang-x64-ninja-win7\llvm\test\DebugInfo\PDB/Inputs/empty.pdb"
  $ "D:/buildslave/clang-x64-ninja-win7/stage1/./bin\FileCheck.EXE" "-check-prefix=YAML" "D:\buildslave\clang-x64-ninja-win7\llvm\test\DebugInfo\PDB\pdbdump-yaml-types.test"
  # command stderr:
  D:\buildslave\clang-x64-ninja-win7\llvm\test\DebugInfo\PDB\pdbdump-yaml-types.test:36:7: error: expected string not found in input
  YAML: Name: apartment
  ^
  <stdin>:153:10: note: scanning from here
  Value: 161
  ^

  llvm-svn: 280577
* Let Microsoft attributes apply to the type, not the variable. | Nico Weber | 2016-09-03 | 3 | -11/+15
  There was already a function that moved attributes off the declspec into an attribute list for attributes applying to the type; teach that function to also move Microsoft attributes around and rename it to match its new broader role.

  Nothing uses Microsoft attributes yet, so no behavior change.

  Part of https://reviews.llvm.org/D23895

  llvm-svn: 280576
* Add plumbing for new attribute type "Microsoft". | Nico Weber | 2016-09-03 | 3 | -7/+28
  This is for attributes in []-delimited lists preceding a class, e.g. `[uuid("...")] class Foo {};`

  Not used by anything yet, so no behavior change.

  Part of https://reviews.llvm.org/D23895

  llvm-svn: 280575
* Move calls of MaybeParseMicrosoftAttributes() before ParseExternalDeclaration() | Nico Weber | 2016-09-03 | 4 | -8/+1
  into ParseDeclOrFunctionDefInternal() (which is called by MaybeParseMicrosoftAttributes()), so that the attributes can be stored in the DeclSpec.

  No behavior change yet, part of https://reviews.llvm.org/D23895

  llvm-svn: 280574
* ADT: Use std::list in SparseBitVector, NFC | Duncan P. N. Exon Smith | 2016-09-03 | 1 | -34/+13
  The only intrusive thing about SparseBitVector's usage of ilist<> was that new was usually called externally. There were no custom traits.

  It seems like the reason to switch to ilist in r41855 was to avoid pointer invalidation, but std::list<> has that feature too. Maybe std::list<>::emplace makes this a little more obvious than it was then.

  Switch over to std::list<> and simplify the code.

  llvm-svn: 280573
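The pointer-stability property mentioned in r280573 can be illustrated with a small standalone sketch. This is not the SparseBitVector code: `Element` and `pointerSurvivesInsertions` are hypothetical names chosen for illustration. It only shows that a pointer to a `std::list` element stays valid while other elements are emplaced around it.

```cpp
#include <cassert>
#include <list>

// Hypothetical stand-in for a sparse bit vector element; not LLVM's type.
struct Element {
  unsigned Index;
  unsigned long Bits;
  Element(unsigned Index, unsigned long Bits) : Index(Index), Bits(Bits) {}
};

// Returns true if a pointer taken before many insertions still refers to the
// same, unmodified element afterwards.
bool pointerSurvivesInsertions() {
  std::list<Element> Elements;
  Elements.emplace_back(0u, 0x1ul);
  Element *First = &Elements.front();

  // Construct 999 more elements in place, each ahead of the first one.
  for (unsigned I = 1; I != 1000; ++I)
    Elements.emplace(Elements.begin(), I, 0ul);

  // The original element is now last, at the same address as before.
  return First == &Elements.back() && First->Index == 0 &&
         First->Bits == 0x1ul;
}
```

The same loop on a `std::vector` could reallocate and leave `First` dangling, which is the kind of invalidation the commit message refers to.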
* Remove function name from comment. | Nico Weber | 2016-09-03 | 1 | -6/+5
  The comment starting with "ParseDeclarationOrFunctionDefinition -" is above a function called ParseDeclOrFunctionDefInternal. Fix the comment by not mentioning a function name, like the style guide requests nowadays.

  No behavior change.

  llvm-svn: 280572
* [PowerPC] Support asm parsing for bc[l][a][+-] mnemonics | Hal Finkel | 2016-09-03 | 6 | -0/+111
  PowerPC assembly code in the wild, so it seems, has things like this:

    bc+ 12, 28, .L9

  This is a bit odd because the '+' here becomes part of the BO field, and the BO field is otherwise the first operand. Nevertheless, the ISA specification does clearly say that the +- hint syntax applies to all conditional-branch mnemonics (that test either CTR or a condition register, although not the forms which check both), both basic and extended, so this is supposed to be valid.

  This introduces some asm-parser-only definitions which take only the upper three bits from the specified BO value, and the lower two bits are implied by the +- suffix (via some associated aliases).

  Fixes PR23646.

  llvm-svn: 280571
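The BO composition described in r280571 can be sketched as follows, for the condition-register-testing form only. This is an illustration, not the LLVM asm-parser code: `composeBO` is a hypothetical helper, and the hint values assume that the low two BO bits form the Power ISA "at" field, with binary 10 meaning "likely not taken" and binary 11 meaning "likely taken".

```cpp
#include <cassert>

// Hypothetical helper, not part of LLVM: combine a user-written BO operand
// with a '+' or '-' hint suffix for a condition-register-testing branch.
// Assumption: the low two BO bits are the "at" hint field (0x2 = likely not
// taken, 0x3 = likely taken).
unsigned composeBO(unsigned BO, char Hint) {
  assert((Hint == '+' || Hint == '-') && "expected a +/- hint suffix");
  const unsigned UpperThreeBits = BO & 0x1Cu; // keep bits 4..2 of the field
  const unsigned HintBits = (Hint == '+') ? 0x3u : 0x2u;
  return UpperThreeBits | HintBits;
}
```

Under these assumptions, the `bc+ 12, 28, .L9` example would use an effective BO of 15, and `bc- 12, 28, .L9` would use 14.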
* ADT: Do not inherit from std::iterator in ilist_iterator | Duncan P. N. Exon Smith | 2016-09-03 | 3 | -19/+13
  Inheriting from std::iterator uses more boiler-plate than manual typedefs. Avoid that in both ilist_iterator and MachineInstrBundleIterator.

  This has the side effect of removing ilist_iterator from certain ADL lookups in namespace std; calls to std::next need to be qualified by "std::" that didn't have to before. The one case of this in-tree was operating on a temporary, so I used the more compact operator++.

  llvm-svn: 280570
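The ADL side effect noted in r280570 can be seen in a small standalone sketch. `demo::ArrayIterator` and `demoSecondElement` are hypothetical names, not LLVM types. The iterator declares its member typedefs by hand instead of inheriting from `std::iterator`, so namespace `std` is not an associated namespace of the class and the call must be spelled `std::next`.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>

namespace demo {
// Hypothetical forward iterator over an int array, using manual typedefs
// rather than a std::iterator base class.
class ArrayIterator {
  const int *Ptr;

public:
  typedef std::forward_iterator_tag iterator_category;
  typedef int value_type;
  typedef std::ptrdiff_t difference_type;
  typedef const int *pointer;
  typedef const int &reference;

  explicit ArrayIterator(const int *Ptr) : Ptr(Ptr) {}
  reference operator*() const { return *Ptr; }
  ArrayIterator &operator++() {
    ++Ptr;
    return *this;
  }
  ArrayIterator operator++(int) {
    ArrayIterator Tmp = *this;
    ++Ptr;
    return Tmp;
  }
  bool operator==(const ArrayIterator &RHS) const { return Ptr == RHS.Ptr; }
  bool operator!=(const ArrayIterator &RHS) const { return Ptr != RHS.Ptr; }
};
} // namespace demo

int demoSecondElement() {
  const int Data[] = {10, 20, 30};
  demo::ArrayIterator It(Data);
  // An unqualified next(It) would only search namespace demo through ADL,
  // because the class has no base class from namespace std.
  return *std::next(It);
}
```

With a `std::iterator` base class, `std` would become an associated namespace through the base, which is why the unqualified spelling used to compile.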
* ADT: Split out iplist_impl from iplist, NFC | Duncan P. N. Exon Smith | 2016-09-03 | 2 | -21/+37
  Split out iplist_impl from iplist, and change SymbolTableList to inherit directly from iplist_impl. This makes it more straightforward to add new template parameters to iplist [*]:

  - iplist_impl takes a "base" list that provides the intrusive functionality (usually simple_ilist<T>) and a traits class.
  - iplist no longer takes a "Traits" template parameter. It only takes the value_type, T, and instantiates iplist_impl with simple_ilist<T> and ilist_traits<T>.
  - SymbolTableList now inherits from iplist_impl, instead of iplist.

  Note for out-of-tree code: if you have an iplist whose second template parameter was *not* the default (i.e., not ilist_traits<YourT>), you have three options:

  - Stop using a custom traits class, and instead specialize ilist_traits<YourT>. This is the usual thing to do.
  - Specialize iplist<YourT> to pass your custom traits class into iplist_impl.
  - Create your own trivial list type that passes your custom traits class into iplist_impl (see SymbolTableList<> for an example).

  [*]: The eventual goal is to start tracking a sentinel bit on the MachineInstr list even when LLVM_ENABLE_ABI_BREAKING_CHECKS is off, which will enable MachineBasicBlock::reverse_iterator to have normal list invalidation semantics matching the new iplist<>::reverse_iterator from r280032.

  llvm-svn: 280569
* Fix buildbot error. | Wei Mi | 2016-09-03 | 1 | -4/+1
  Add -mtriple=x86_64-unknown-linux-gnu for the test and move it to CodeGen/X86.

  llvm-svn: 280568
* ADT: Rename NodeTy to T in iplist/ilist template parameters | Duncan P. N. Exon Smith | 2016-09-03 | 1 | -54/+59
  And use other typedefs so that the next rename has a smaller diff.

  llvm-svn: 280567
* ReaderWriter: Use ilist_noalloc_traits for TrieEdge, NFC | Duncan P. N. Exon Smith | 2016-09-03 | 1 | -8/+3
  Adopt r280128 in lld, specializing ilist_alloc_traits rather than reinventing the wheel.

  llvm-svn: 280566
* ADT: Remove external uses of ilist_iterator, NFC | Duncan P. N. Exon Smith | 2016-09-03 | 4 | -12/+4
  Delete the dead code for Write(ilist_iterator) in the IR Verifier, inline report(ilist_iterator) at its call sites in the MachineVerifier, and use simple_ilist<>::iterator in SymbolTableListTraits.

  The only remaining reference to ilist_iterator outside of the ilist implementation is from MachineInstrBundleIterator. I'll get rid of that in a follow-up.

  llvm-svn: 280565
* ADT: Fix up IListTest.privateNode and get it passing | Duncan P. N. Exon Smith | 2016-09-03 | 3 | -6/+14
  This test was using the wrong type, and so not actually testing much. ilist_iterator constructors weren't going through ilist_node_access, so they didn't actually work with private inheritance.

  llvm-svn: 280564
* [SE] Add getByteCount methods for device memory | Jason Henline | 2016-09-03 | 2 | -13/+22
  Summary:
  Simple utility methods will prevent users from making mistakes when converting element counts to byte counts.

  Reviewers: jlebar

  Subscribers: jprice, parallel_libs-commits

  Differential Revision: https://reviews.llvm.org/D24197

  llvm-svn: 280563
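The idea behind r280563 can be sketched in a few lines. This is not the StreamExecutor API: `TypedDeviceBuffer` is a hypothetical class used only to show how a typed handle can expose the byte count, so callers never multiply by the element size themselves.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical typed handle for a device allocation; an illustration only,
// not the StreamExecutor class touched by the commit.
template <typename ElemT> class TypedDeviceBuffer {
public:
  explicit TypedDeviceBuffer(std::size_t ElementCount)
      : ElementCount(ElementCount) {}

  // Number of ElemT values in the buffer.
  std::size_t getElementCount() const { return ElementCount; }

  // Size of the buffer in bytes, derived from the element type so that the
  // conversion lives in exactly one place.
  std::size_t getByteCount() const { return ElementCount * sizeof(ElemT); }

private:
  std::size_t ElementCount;
};
```

A copy routine that takes a byte count can then be passed `Buffer.getByteCount()` directly, which avoids the common mistake of passing an element count where bytes are expected.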
* [Sema] Fix how we set implicit conversion kinds. | George Burgess IV | 2016-09-03 | 1 | -9/+15
  We have invariants we like to guarantee for the `ImplicitConversionKind`s in a `StandardConversionSequence`. These weren't being upheld in code that r280553 touched, so Richard suggested that we should fix that. See D24113.

  I'm not entirely sure how to go about testing this, so no test case is included. Suggestions welcome.

  llvm-svn: 280562
* Define _LIBCPP_SAFE_STATIC __attribute__((require_constant_initialization)), and apply it to memory_resource | Eric Fiselier | 2016-09-03 | 2 | -4/+15
  llvm-svn: 280561
* [PowerPC] Add asm parser/disassembler support for hrfid,nap,slbmfev | Hal Finkel | 2016-09-02 | 5 | -0/+47
  These few book-III instructions are used by the Linux kernel.

  Partially fixes PR24796.

  llvm-svn: 280560
* [PowerPC] Add support for the extended dcbf form and mnemonics | Hal Finkel | 2016-09-02 | 5 | -5/+64
  dcbf has an optional hint-like field; add support for the extended form and the associated mnemonics (dcbfl and dcbflp).

  Partially fixes PR24796.

  llvm-svn: 280559
* Dependences: Only create flat StmtSchedule in presence of reductions | Tobias Grosser | 2016-09-02 | 1 | -1/+1
  Without reductions we do not need a flat union_map schedule describing the computation we want to perform, but can work purely on the schedule tree. This reduces the dependence computation and scheduling time from 33ms to 25ms. Another 30% reduction.

  llvm-svn: 280558
* Dependences: Exit early, if no reduction dependences are needed. | Tobias Grosser | 2016-09-02 | 1 | -1/+12
  In case we do not compute reduction dependences or dependences that are more fine-grained than statement level dependences, we can avoid the corresponding part of the dependence analysis altogether.

  For the 3mm benchmark, this reduces scheduling + dependence analysis time from 62ms to 33ms for a no-asserts build. The majority of the compile time is anyhow spent in the LLVM backends, when doing code generation. Nevertheless, there is no need to waste compile time either.

  llvm-svn: 280557
* (clang part) Implement MASM-flavor intel syntax behavior for inline MS asm block. | Yunzhong Gao | 2016-09-02 | 3 | -21/+21
  Clang tests for verifying the following syntaxes:

  1. 0xNN and NNh are accepted as valid hexadecimal numbers, but 0xNNh is not. 0xNN and NNh may come with optional U or L suffix.
  2. NNb is accepted as a valid binary (base-2) number, but 0bNN is not. NNb may come with optional U or L suffix.

  Differential Revision: https://reviews.llvm.org/D22112

  llvm-svn: 280556
* (LLVM part) Implement MASM-flavor intel syntax behavior for inline MS asm block: | Yunzhong Gao | 2016-09-02 | 3 | -3/+48
  1. 0xNN and NNh are accepted as valid hexadecimal numbers, but 0xNNh is not. 0xNN and NNh may come with optional U or L suffix.
  2. NNb is accepted as a valid binary (base-2) number, but 0bNN is not. NNb may come with optional U or L suffix.

  Differential Revision: https://reviews.llvm.org/D22112

  llvm-svn: 280555
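The two literal rules in r280555 and r280556 can be expressed as a small classifier. This is a sketch under stated assumptions, not the LLVM lexer: `LiteralKind` and `classifyMasmLiteral` are hypothetical names, at most one trailing U or L is stripped, matching is case-insensitive, and a literal is assumed to start with a decimal digit.

```cpp
#include <cassert>
#include <cctype>
#include <string>

enum class LiteralKind { Invalid, Decimal, Hex, Binary };

bool isHexDigit(char C) {
  return std::isxdigit(static_cast<unsigned char>(C)) != 0;
}
bool isBinaryDigit(char C) { return C == '0' || C == '1'; }
bool isDecimalDigit(char C) { return C >= '0' && C <= '9'; }

// True if S is non-empty and every character satisfies Pred.
bool allCharsMatch(const std::string &S, bool (*Pred)(char)) {
  if (S.empty())
    return false;
  for (char C : S)
    if (!Pred(C))
      return false;
  return true;
}

// Hypothetical classifier for MASM-flavored integer literals.
LiteralKind classifyMasmLiteral(std::string Text) {
  for (char &C : Text)
    C = static_cast<char>(std::tolower(static_cast<unsigned char>(C)));

  // Assumption: strip at most one optional U or L suffix.
  if (!Text.empty() && (Text.back() == 'u' || Text.back() == 'l'))
    Text.pop_back();
  if (Text.empty() || !isDecimalDigit(Text.front()))
    return LiteralKind::Invalid;

  // 0xNN form. A trailing 'h' is not a hex digit, so 0xNNh is rejected here.
  if (Text.size() > 2 && Text[0] == '0' && Text[1] == 'x')
    return allCharsMatch(Text.substr(2), isHexDigit) ? LiteralKind::Hex
                                                     : LiteralKind::Invalid;

  // NNh form. Checked before NNb because 'b' is itself a hex digit.
  if (Text.back() == 'h')
    return allCharsMatch(Text.substr(0, Text.size() - 1), isHexDigit)
               ? LiteralKind::Hex
               : LiteralKind::Invalid;

  // NNb form. 0bNN never reaches this branch with a valid body.
  if (Text.back() == 'b')
    return allCharsMatch(Text.substr(0, Text.size() - 1), isBinaryDigit)
               ? LiteralKind::Binary
               : LiteralKind::Invalid;

  return allCharsMatch(Text, isDecimalDigit) ? LiteralKind::Decimal
                                             : LiteralKind::Invalid;
}
```

Checking the `h` suffix before the `b` suffix is a deliberate choice in this sketch: a literal such as `1Bh` ends in a hex digit followed by `h` and should be read as hexadecimal rather than as a malformed binary number.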
* Introduce option to run isl AST generation, but no IR generation. | Tobias Grosser | 2016-09-02 | 1 | -8/+12
  We replace the options

    -polly-code-generator=none
                         =isl

  with the options

    -polly-code-generation=none
                          =ast
                          =full

  This allows us to measure the overhead of Polly itself, versus the compile time increases due to us generating more IR and consequently the LLVM backends spending more time on this IR.

  We also use this opportunity to rename the option. The original name was introduced at a point where we still had two code generators: CLooG and the isl AST generator. Since we only have one AST generator left, there is no need to distinguish between 'isl' and something else. However, being able to disable code generation altogether has proven useful for debugging. Hence, we rename and extend this option to make it a good fit for its new use case.

  llvm-svn: 280554
* [Sema] Relax overloading restrictions in C. | George Burgess IV | 2016-09-02 | 9 | -43/+109
  This patch allows us to perform incompatible pointer conversions when resolving overloads in C. So, the following code will no longer fail to compile (though it will still emit warnings, assuming the user hasn't opted out of them):

  ```
  void foo(char *) __attribute__((overloadable));
  void foo(int) __attribute__((overloadable));

  void callFoo() {
    unsigned char bar[128];
    foo(bar); // selects the char* overload.
  }
  ```

  These conversions are ranked below all others, so:

  A. Any other viable conversion will win out
  B. If we had another incompatible pointer conversion in the example above (e.g. `void foo(int *)`), we would complain about an ambiguity.

  Differential Revision: https://reviews.llvm.org/D24113

  llvm-svn: 280553