Commit message (Author, Age, Files, Lines)
...
* Check only loop control of loops that are part of the region (Johannes Doerfert, 2016-04-25, 3 files, -4/+35)
    This also removes a duplicated line of code in the region generator that caused a SPEC benchmark to fail with the new SCoPs.
    llvm-svn: 267404
* Initialize the invalid domain of an access with an empty set (Johannes Doerfert, 2016-04-25, 1 file, -1/+3)
    llvm-svn: 267403
* Do not propagate invalid domains over back edges (Johannes Doerfert, 2016-04-25, 1 file, -0/+4)
    llvm-svn: 267402
* Introduce a parameter set type [NFC] (Johannes Doerfert, 2016-04-25, 5 files, -26/+23)
    llvm-svn: 267401
* Remove unnecessary argument of the SCEVValidator [NFC] (Johannes Doerfert, 2016-04-25, 5 files, -35/+20)
    llvm-svn: 267400
* Typo. NFC. (Chad Rosier, 2016-04-25, 1 file, -1/+1)
    llvm-svn: 267399
* [Clang][AVX512][BUILTIN] Adding intrinsics for VSCATTERPF{1|0}{DPS|QPS|DPD|QPD} instruction set (Michael Zuckerman, 2016-04-25, 4 files, -0/+263)
    Differential Revision: http://reviews.llvm.org/D19313
    llvm-svn: 267398
* [Hexagon] Correctly set "Flags" in ELF header (Krzysztof Parzyszek, 2016-04-25, 2 files, -3/+16)
    llvm-svn: 267397
* Simplify. NFC. (Rafael Espindola, 2016-04-25, 1 file, -30/+18)
    llvm-svn: 267396
* [OPENMP 4.5] Codegen for 'taskloop' directive. (Alexey Bataev, 2016-04-25, 6 files, -77/+748)
    The taskloop construct specifies that the iterations of one or more associated loops will be executed in parallel using OpenMP tasks. The iterations are distributed across tasks created by the construct and scheduled to be executed.

    The following code will be generated for the taskloop directive:

        #pragma omp taskloop num_tasks(N) lastprivate(j)
        for( i=0; i<N*GRAIN*STRIDE-1; i+=STRIDE ) {
          int th = omp_get_thread_num();
          #pragma omp atomic
          counter++;
          #pragma omp atomic
          th_counter[th]++;
          j = i;
        }

    Generated code:

        task = __kmpc_omp_task_alloc(NULL,gtid,1,sizeof(struct task),sizeof(struct shar),&task_entry);
        psh = task->shareds;
        psh->pth_counter = &th_counter;
        psh->pcounter = &counter;
        psh->pj = &j;
        task->lb = 0;
        task->ub = N*GRAIN*STRIDE-2;
        task->st = STRIDE;
        __kmpc_taskloop(
            NULL,                     // location
            gtid,                     // gtid
            task,                     // task structure
            1,                        // if clause value
            &task->lb,                // lower bound
            &task->ub,                // upper bound
            STRIDE,                   // loop increment
            0,                        // 1 if nogroup specified
            2,                        // schedule type: 0-none, 1-grainsize, 2-num_tasks
            N,                        // schedule value (ignored for type 0)
            (void*)&__task_dup_entry  // tasks duplication routine
        );

    llvm-svn: 267395
* Delete needsCopyRelImpl. It is redundant with getRelExpr. (Rafael Espindola, 2016-04-25, 3 files, -56/+17)
    llvm-svn: 267394
* [GlobalOpt] Allow constant globals to be SRA'd (James Molloy, 2016-04-25, 2 files, -5/+30)
    The current logic assumes that any constant global will never be SRA'd. I presume this is because normally constant globals can be pushed into their uses and deleted. However, that sometimes can't happen (which is where you really want SRA, so the elements that can be eliminated, are!).

    There seems to be no reason why we can't SRA constants too, so let's do it.

    llvm-svn: 267393
* Remove flaky decorator from two tests on linux (Pavel Labath, 2016-04-25, 2 files, -4/+0)
    The flakiness is no longer reproducible, and the tests seem to be passing reliably now.
    llvm-svn: 267392
* [ELF] Delete extra line. NFC (Simon Atanasyan, 2016-04-25, 1 file, -1/+0)
    llvm-svn: 267391
* [Coverage] Restore the correct count value after processing a nested region in case of combined regions. (Igor Kudrin, 2016-04-25, 4 files, -46/+80)
    If several regions cover the same area of code, we have to restore the combined value for that area when returning from a nested region.

    This patch achieves that by combining regions before calling buildSegments.

    Differential Revision: http://reviews.llvm.org/D18610
    llvm-svn: 267390
* [SCEV] Improve the run-time checking of the NoWrap predicate (Silviu Baranga, 2016-04-25, 2 files, -31/+190)
    Summary:
    This implements a new method of run-time checking the NoWrap SCEV predicates, which should be easier to optimize and nicer for targets that don't correctly handle multiplication/addition of large integer types (like i128).

    If the AddRec is {a,+,b} and the backedge taken count is c, the idea is to check that |b| * c doesn't have unsigned overflow, and depending on the sign of b, that:

        a + |b| * c >= a (b >= 0) or
        a - |b| * c <= a (b <= 0)

    where the comparisons above are signed or unsigned, depending on the flag that we're checking.

    The advantage of doing this is that we avoid extending to a larger type and we avoid the multiplication of large types (multiplying i128 can be expensive).

    Reviewers: sanjoy
    Subscribers: llvm-commits, mzolotukhin
    Differential Revision: http://reviews.llvm.org/D19266
    llvm-svn: 267389
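The check described in r267389 can be illustrated with a minimal sketch for the unsigned case on 32-bit values. The function name `addRecHasNoUnsignedWrap`, the fixed 32-bit width, and the division-based overflow test are assumptions made for this example; this is not the code the patch adds to LLVM, which emits the check as IR.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not an LLVM API): models the described check for the
// AddRec {a,+,b} with backedge-taken count c, using unsigned comparisons.
bool addRecHasNoUnsignedWrap(std::uint32_t a, std::int32_t b, std::uint32_t c) {
  // |b| is computed in unsigned arithmetic so that INT32_MIN is handled.
  const std::uint32_t ub = static_cast<std::uint32_t>(b);
  const std::uint32_t absB = b < 0 ? 0u - ub : ub;

  // Step 1: |b| * c must not overflow. Testing by division keeps everything
  // in the original width, so no wider integer type is needed.
  if (c != 0 && absB > std::numeric_limits<std::uint32_t>::max() / c)
    return false;
  const std::uint32_t delta = absB * c;

  // Step 2: compare the final value against the start value.
  if (b >= 0)
    return static_cast<std::uint32_t>(a + delta) >= a;  // a + |b| * c >= a
  return static_cast<std::uint32_t>(a - delta) <= a;    // a - |b| * c <= a
}
```

For a signed no-wrap flag the same two steps would apply with signed comparisons in step 2, as the commit message notes.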
* [PowerPC] [PR27387] Disallow r0 for ADD8TLS. (Marcin Koscielnicki, 2016-04-25, 2 files, -2/+47)
    ADD8TLS, a variant of add instruction used for initial-exec TLS, currently accepts r0 as a source register. While add itself supports r0 just fine, linker can relax it to a local-exec sequence, converting it to addi - which doesn't support r0.

    Differential Revision: http://reviews.llvm.org/D19193
    llvm-svn: 267388
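The reason r0 is safe for add but not for addi can be shown with a small model of the two instructions. In the PowerPC ISA, an RA field of 0 in addi denotes the literal value 0 rather than the contents of r0. The names `RegisterFile`, `execAdd` and `execAddi` are hypothetical and exist only for this sketch.

```cpp
#include <array>
#include <cstdint>

// Hypothetical model of the 32 general-purpose registers.
using RegisterFile = std::array<std::int64_t, 32>;

// add RT,RA,RB: both operands are always read from registers, r0 included.
std::int64_t execAdd(const RegisterFile &regs, unsigned ra, unsigned rb) {
  return regs[ra] + regs[rb];
}

// addi RT,RA,SI: an RA field of 0 means the literal 0, not the value in r0.
std::int64_t execAddi(const RegisterFile &regs, unsigned ra, std::int16_t imm) {
  const std::int64_t base = (ra == 0) ? 0 : regs[ra];
  return base + imm;
}
```

If the linker rewrites an add that reads r0 into an addi with the same register field, the value held in r0 silently drops out of the sum, which is why the register has to be excluded up front.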
* Run GlobalOpt before emitting the bitcode for ThinLTO (Mehdi Amini, 2016-04-25, 1 file, -0/+2)
    This is motivated by reducing the size of the IR and thus reducing compile time.
    From: Mehdi Amini <mehdi.amini@apple.com>
    llvm-svn: 267385
* ThinLTO: Move createNameAnonFunctionPass insertion in PassManagerBuilder (NFC) (Mehdi Amini, 2016-04-25, 1 file, -3/+4)
    It is just code motion, but makes more sense this way.
    From: Mehdi Amini <mehdi.amini@apple.com>
    llvm-svn: 267384
* fix comments (Igor Breger, 2016-04-25, 1 file, -2/+1)
    related to Differential Revision: http://reviews.llvm.org/D17913
    llvm-svn: 267383
* [ELF] - Implemented comparison operators for linkerscript. (George Rimar, 2016-04-25, 3 files, -8/+146)
    Patch adds support for the <, >, !=, ==, >=, <= operators.
    Differential revision: http://reviews.llvm.org/D19419
    llvm-svn: 267382
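To make the operator set from r267382 concrete, here is a minimal sketch of evaluating one comparison. It assumes C-like semantics on unsigned 64-bit values, with the result being 1 or 0; the function name `evalComparison` is hypothetical and this is not lld's expression parser.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical evaluator for a single linker-script comparison.
// Assumption: operands are unsigned 64-bit values and the result is 1 or 0.
std::uint64_t evalComparison(const std::string &op, std::uint64_t lhs,
                             std::uint64_t rhs) {
  if (op == "<")  return lhs < rhs;
  if (op == ">")  return lhs > rhs;
  if (op == "!=") return lhs != rhs;
  if (op == "==") return lhs == rhs;
  if (op == ">=") return lhs >= rhs;
  if (op == "<=") return lhs <= rhs;
  throw std::invalid_argument("unknown comparison operator: " + op);
}
```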
* [ELF] - Removed dead declarations. NFC. (George Rimar, 2016-04-25, 1 file, -5/+0)
    llvm-svn: 267381
* [Clang][AVX512][BuiltIn] Adding support to intrinsics of VPERMD and VPERMW instruction set (Michael Zuckerman, 2016-04-25, 13 files, -1/+802)
    Differential Revision: http://reviews.llvm.org/D19195
    llvm-svn: 267380
* Fixing wrong mask size error. From __mmask8 to __mmask16. (Michael Zuckerman, 2016-04-25, 2 files, -6/+6)
    Was reviewed over the shoulder by AsafBadouh.
    Connected to review http://reviews.llvm.org/D19195.
    llvm-svn: 267379
* [Support/ELFRelocs] Add R_386_GOT32X. (Davide Italiano, 2016-04-25, 1 file, -1/+2)
    The new relocation recently defined in the Intel386 psABI was still missing from this file. A subsequent commit will add support for GOT32X in MC, together with a test.
    llvm-svn: 267378
* [X86] Replace a SmallVector used to pass 2 values to an ArrayRef parameter with a fixed size array. NFC (Craig Topper, 2016-04-25, 1 file, -3/+1)
    llvm-svn: 267377
* [esan] Fix uninitialized warning from interception context (Derek Bruening, 2016-04-25, 1 file, -0/+2)
    The interception context is not used by esan, but the compiler complains about it being uninitialized all the same. We set it to null to avoid the warning.
    llvm-svn: 267376
* Minor code cleanups. NFC. (Junmo Park, 2016-04-25, 4 files, -23/+23)
    llvm-svn: 267375
* [llgo] llgoi: separate evaluation from printing (Andrew Wilkins, 2016-04-25, 7 files, -94/+124)
    Summary:
    Separate the evaluation of expressions from printing of results. This is in preparation for splitting the core of the interpreter out for use in alternative interpreter frontends.

    At the same time, the output is made less noisy in response to comments on the golang-nuts announcement. We would ideally print out values using Go syntax, but this is impractical until we have libgo based on Go 1.5. When that happens, fmt's %#v will handle reflect.Value better, and so we can fix/filter type names to remove automatically generated package names.

    Reviewers: pcc
    Subscribers: llvm-commits, axw
    Differential Revision: http://reviews.llvm.org/D13761
    llvm-svn: 267374
* [X86] Add a complete set of tests for all operand sizes of cttz/ctlz with and without zero undef being lowered to bsf/bsr. (Craig Topper, 2016-04-25, 1 file, -6/+123)
    llvm-svn: 267373
* Add a --element-count option to the expression command (Enrico Granata, 2016-04-25, 14 files, -13/+294)
    This option evaluates an expression and, if the result is of pointer type, treats it as if it was an array of that many elements and displays those elements.

    This has a couple of subtle points but is mostly as straightforward as it sounds.

    Add a parray N <expr> alias for this new mode.

    Also, extend the --object-description mode to do the moral equivalent of the above but display each element in --object-description mode.

    Add a poarray N <expr> alias for this.

    llvm-svn: 267372
* Add a note to the test explaining why it doesn't match gold's behaviour. (Peter Collingbourne, 2016-04-25, 1 file, -0/+3)
    llvm-svn: 267371
* Verifier: Verify that each inlinable callsite of a debug-info-bearing function in a debug-info-bearing function has a debug location attached to it. (Adrian Prantl, 2016-04-24, 4 files, -2/+75)
    Failure to do so causes an "!dbg attachment points at wrong subprogram for function" assertion failure when the inliner sets up inline scope info.

    rdar://problem/25878916

    This reapplies r267320 without changes after fixing an issue in the OpenMP IR generator in clang.

    llvm-svn: 267370
* Debug info: Apply an empty debug location for global OpenMP destructors. (Adrian Prantl, 2016-04-24, 2 files, -2/+5)
    LLVM really wants a debug location on every inlinable call in a function with debug info, because it otherwise cannot set up inlining debug info. This change applies an artificial line 0 debug location (which is how DWARF marks automatically generated code that has no corresponding source code) to the .__kmpc_global_dtor_. functions to avoid the LLVM Verifier complaining.
    llvm-svn: 267369
* clang-format: [JS] generator and async functions. (Martin Probst, 2016-04-24, 5 files, -11/+69)
    For generators, see:
    https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Iterators_and_generators

    async functions are not quite in the spec yet, but stage 3 and already widely used:
    http://tc39.github.io/ecmascript-asyncawait/

    Reviewers: djasper
    Subscribers: klimek
    Differential Revision: http://reviews.llvm.org/D19204
    llvm-svn: 267368
* Also check the IR. (Rafael Espindola, 2016-04-24, 1 file, -0/+4)
    llvm-svn: 267367
* Add a test for how we handle protected visibility. (Rafael Espindola, 2016-04-24, 2 files, -0/+22)
    llvm-svn: 267366
* unwind: remove unnecessary header (Saleem Abdulrasool, 2016-04-24, 1 file, -3/+0)
    Availability.h is not used within config.h. The locations which use the availability infrastructure already include the necessary header(s). NFC.
    llvm-svn: 267365
* unwind: unify _LIBUNWIND_ABORT (Saleem Abdulrasool, 2016-04-24, 1 file, -18/+8)
    Rather than use the `__assert_rtn` on libSystem based targets and a local `assert_rtn` function on others, expand the function definition into a macro which will perform the writing to stderr and then abort. This unifies the definition and behaviour across targets.

    Ensure that we flush stderr prior to aborting.

    llvm-svn: 267364
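The pattern described in r267364 (write to stderr, flush, then abort, all inside one macro) might look roughly like the sketch below. The macro name `SKETCH_ABORT`, the message format, and the `checkedDivide` caller are assumptions for illustration and are not libunwind's actual definitions.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for an abort macro: report where the failure
// happened, flush stderr so the message is not lost, then abort.
#define SKETCH_ABORT(msg)                                          \
  do {                                                             \
    std::fprintf(stderr, "sketch: %s %s:%d - %s\n", __func__,      \
                 __FILE__, __LINE__, msg);                         \
    std::fflush(stderr);                                           \
    std::abort();                                                  \
  } while (0)

// Example caller: aborts on an impossible input, otherwise returns normally.
int checkedDivide(int numerator, int denominator) {
  if (denominator == 0)
    SKETCH_ABORT("division by zero");
  return numerator / denominator;
}
```

Using a macro rather than a function lets `__func__`, `__FILE__` and `__LINE__` expand at the call site, so the message points at the failing code and not at a shared helper.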
* Fix unwind failures when PC points beyond the end of a function (Ulrich Weigand, 2016-04-24, 2 files, -6/+8)
    RegisterContextLLDB::InitializeNonZerothFrame already has code to attempt to detect and handle the case where the PC points beyond the end of a function, but there are certain cases where this doesn't work correctly.

    In fact, there are *two* different places where this detection is attempted, and the failure is in fact a result of an unfortunate interaction between those two separate attempts.

    First, the ResolveSymbolContextForAddress routine is called with the resolve_tail_call_address flag set to true. This causes the routine to internally accept a PC pointing beyond the end of a function, and still resolving the PC to that function symbol.

    Second, the InitializeNonZerothFrame routine itself maintains a "decr_pc_and_recompute_addr_range" flag and, if that turns out to be true, itself decrements the PC by one and searches again for a symbol at that new PC value.

    Both approaches correctly identify the symbol associated with the PC. However, the problem is now that later on, we also need to find the DWARF CFI record associated with the PC. This is done in the RegisterContextLLDB::GetFullUnwindPlanForFrame routine, and uses the "m_current_offset_backed_up_one" member variable.

    However, that variable only actually contains the PC "backed up by one" if the *second* approach above was taken. If the function was already identified via the first approach above, that member variable is *not* backed up by one but simply points to the original PC. This in turn causes GetEHFrameUnwindPlan to not correctly identify the DWARF CFI record associated with the PC.

    Now, in many cases, if the first method had to back up the PC by one, we *still* use the second method too, because of this piece of code:

        // Or if we're in the middle of the stack (and not "above" an asynchronous event like sigtramp),
        // and our "current" pc is the start of a function...
        if (m_sym_ctx_valid
            && GetNextFrame()->m_frame_type != eTrapHandlerFrame
            && GetNextFrame()->m_frame_type != eDebuggerFrame
            && addr_range.GetBaseAddress().IsValid()
            && addr_range.GetBaseAddress().GetSection() == m_current_pc.GetSection()
            && addr_range.GetBaseAddress().GetOffset() == m_current_pc.GetOffset())
        {
            decr_pc_and_recompute_addr_range = true;
        }

    In many cases, when the PC is one beyond the end of the current function, it will indeed then be exactly at the start of the next function. But this is not always the case, e.g. if there happens to be alignment padding between the end of one function and the start of the next.

    In those cases, we may successfully look up the function symbol via ResolveSymbolContextForAddress, but *not* set decr_pc_and_recompute_addr_range, and therefore fail to find the correct DWARF CFI record.

    A very simple fix for this problem is to just never use the first method. Call ResolveSymbolContextForAddress with resolve_tail_call_address set to false, which will cause it to fail if the PC is beyond the end of the current function; or else, identify the next function if the PC is also at the start of the next function. In either case, we will then set the decr_pc_and_recompute_addr_range variable and back up the PC anyway, but this time also find the correct DWARF CFI.

    A related problem is that the ResolveSymbolContextForAddress sometimes returns a "symbol" with empty name. This turns out to be an ELF section symbol. Now, usually those get type eSymbolTypeInvalid. However, there is code in ObjectFileELF::ParseSymbols that tries to change the type of invalid symbols to eSymbolTypeCode or eSymbolTypeData if the symbol lies within the code or data section.

    Unfortunately, this check also hits the symbol for the code section itself, which is then marked as eSymbolTypeCode. While the size of the section symbol is 0 according to the ELF file, LLDB considers this size invalid and attempts to figure out the "correct" size. Depending on how this goes, we may end up with a symbol that overlays part of the code section, even outside areas covered by real function symbols.

    Therefore, if we call ResolveSymbolContextForAddress with PC pointing beyond the end of a function, we may get this bogus section symbol. This again means InitializeNonZerothFrame thinks we have a valid PC, but then we don't find any unwind info for it.

    The fix for this problem is to simply always leave ELF section symbols as type eSymbolTypeInvalid.

    Differential Revision: http://reviews.llvm.org/D18975
    llvm-svn: 267363
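The failure mode in r267363 comes down to how a return address just past a function's last instruction is resolved. The sketch below models that with a plain table of address ranges; `FunctionRange`, `findFunction` and `findFunctionForReturnAddress` are hypothetical names and none of this is LLDB's API.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical symbol-table entry covering the half-open range [start, end).
struct FunctionRange {
  std::string name;
  std::uint64_t start;  // first address of the function
  std::uint64_t end;    // one past the last address of the function
};

// Exact lookup: returns the function containing pc, or nullptr if none does.
const FunctionRange *findFunction(const std::vector<FunctionRange> &table,
                                  std::uint64_t pc) {
  for (const FunctionRange &f : table)
    if (pc >= f.start && pc < f.end)
      return &f;
  return nullptr;
}

// Lookup for a non-zeroth frame: the saved PC is a return address, which may
// sit one past the end of the caller, so back up by one before searching.
const FunctionRange *
findFunctionForReturnAddress(const std::vector<FunctionRange> &table,
                             std::uint64_t returnPc) {
  if (returnPc == 0)
    return nullptr;
  return findFunction(table, returnPc - 1);
}
```

With padding between two functions the exact lookup finds nothing at the end address, and with adjacent functions it finds the following function instead. Backing up by one resolves to the caller in both layouts, and the same backed-up address can then be used for the unwind-info lookup.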
* [X86][AVX] Added PR24935 test case (Simon Pilgrim, 2016-04-24, 1 file, -0/+39)
    llvm-svn: 267362
* ARM: fix __chkstk Frame Setup on WoA (Saleem Abdulrasool, 2016-04-24, 6 files, -11/+13)
    This corrects the MI annotations for the stack adjustment following the __chkstk invocation. We were marking the original SP usage as a Def rather than Kill. The (new) assigned value is the definition, the original reference is killed.

    Adjust the ISelLowering to mark Kills and FrameSetup as well.

    This partially resolves PR27480.

    llvm-svn: 267361
* Tweak comments to make it clear that these combines are for SSE scalar instructions. (Simon Pilgrim, 2016-04-24, 1 file, -4/+5)
    llvm-svn: 267360
* [InstCombine][SSE] Reduce DIVSS/DIVSD to FDIV if only first element is required (Simon Pilgrim, 2016-04-24, 3 files, -11/+11)
    As discussed on D19318, if we only demand the first element of a DIVSS/DIVSD intrinsic, then reduce to a FDIV call. This matches the existing FADD/FSUB/FMUL patterns.
    llvm-svn: 267359
* [ELF] Reinstate 'else' which was previously removed. (Davide Italiano, 2016-04-24, 1 file, -1/+2)
    It turns out it's actually needed.
    llvm-svn: 267358
* [InstCombine][SSE] Demanded vector elements for scalar intrinsics (Part 2 of 2) (Simon Pilgrim, 2016-04-24, 5 files, -183/+98)
    Split from D17490. This patch improves support for determining the demanded vector elements through SSE scalar intrinsics:

    1 - demanded vector element support for unary and some extra binary scalar intrinsics (RCP/RSQRT/SQRT/FRCZ and ADD/CMP/DIV/ROUND).
    2 - addss/addsd get simplified to a fadd call if we aren't interested in the pass through elements
    3 - if we don't need the lowest element of a scalar operation then just use the first argument (the pass through elements) directly

    We can add support for propagating demanded elements through any equivalent packed SSE intrinsics in a future patch (these wouldn't use the pass through patterns).

    Differential Revision: http://reviews.llvm.org/D19318
    llvm-svn: 267357
* [InstCombine][SSE] Demanded vector elements for scalar intrinsics (Part 1 of 2) (Simon Pilgrim, 2016-04-24, 4 files, -145/+126)
    This patch improves support for determining the demanded vector elements through SSE scalar intrinsics:

    1 - recognise that we only need the lowest element of the second input for binary scalar operations (and all the elements of the first input)
    2 - recognise that the roundss/roundsd intrinsics use the lowest element of the second input and the remaining elements from the first input

    Differential Revision: http://reviews.llvm.org/D17490
    llvm-svn: 267356
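The lane behaviour that the demanded-elements patches (r267356, r267357) rely on can be modelled without any intrinsics. In the sketch below, `Vec4` and this `addss` function are plain C++ stand-ins for a 4 x float vector and the scalar add instruction; they are illustrative assumptions, not the SSE intrinsics themselves.

```cpp
#include <array>

// Hypothetical model of a 4 x float SSE register.
using Vec4 = std::array<float, 4>;

// Model of a scalar binary operation such as addss: lane 0 is computed from
// lane 0 of both inputs, lanes 1-3 pass through unchanged from the first.
Vec4 addss(const Vec4 &a, const Vec4 &b) {
  Vec4 result = a;          // pass-through elements come from the first input
  result[0] = a[0] + b[0];  // only the lowest element of b is ever read
  return result;
}
```

In this model, lanes 1-3 of the second operand never influence the result, a caller that only reads lane 0 could use a plain scalar add, and a caller that never reads lane 0 could use the first operand directly. These correspond to the simplifications listed in the two commit messages.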
* [InstCombine] Avoid updating argument demanded elements in separate passes. (Simon Pilgrim, 2016-04-24, 1 file, -7/+15)
    As discussed on D17490, we should attempt to update the demanded elements of an intrinsic's arguments in one pass if we can.
    llvm-svn: 267355
* Fix typo in comment. NFC (Nick Lewycky, 2016-04-24, 1 file, -1/+1)
    llvm-svn: 267354
* Remove emacs mode markers from .cpp files. NFC (Nick Lewycky, 2016-04-24, 2 files, -2/+2)
    .cpp files are unambiguously C++, you only need the mode markers on .h files.
    llvm-svn: 267353