summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* LTO: Fix use-after-scope error.Peter Collingbourne2016-09-291-0/+1
| | | | llvm-svn: 282665
* AArch64: Set shift bit of TLSLE HI12 add instructionLei Liu2016-09-291-0/+8
| | | | | | | | | | | | Summary: AArch64 LLVM assembler emits add instruction without shift bit to calculate the higher 12-bit address of TLS variables in local exec model. This generates wrong code sequence to access TLS variables with thread offset larger than 0x1000. Reviewers: t.p.northover, peter.smith, rovka Subscribers: salim.nasser, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D24702 llvm-svn: 282661
* Wisely choose sext or zext when widening IV.Evgeny Stupachenko2016-09-281-37/+88
| | | | | | | | | | | | Summary: The patch fixes regression caused by two earlier patches D18777 and D18867. Reviewers: reames, sanjoy Differential Revision: http://reviews.llvm.org/D24280 From: Li Huang llvm-svn: 282650
* Next set of additional error checks for invalid Mach-O files for theKevin Enderby2016-09-281-0/+32
| | | | | | | | | load command that uses the Mach::rpath_command type but not used in llvm libObject code but used in llvm tool code. This includes just the LC_RPATH load command. llvm-svn: 282649
* [RegisterBankInfo] Uniquely generate OperandsMapping.Quentin Colombet2016-09-282-24/+90
| | | | | | | | | | | This is a step toward statically allocate InstructionMapping. Like the previous few commits, the goal is to move toward a TableGen'ed like structure with no dynamic allocation at all. This should already improve compile time by getting rid of a bunch of memmove of SmallVectors. llvm-svn: 282643
* [RegisterBankInfo] Rework the APIs of ValueMapping.Quentin Colombet2016-09-281-10/+12
| | | | | | | This is a preparatory commit for more TableGen-like structure. NFC llvm-svn: 282642
* Remove dead code from LiveDebugVariables.cpp (NFC)Adrian Prantl2016-09-281-44/+10
| | | | | | | | LiveDebugVariables doesn't propagate DBG_VALUEs accross basic block boundaries any more; this functionality was split into LiveDebugValues. We can thus drop the now dead references to LexicalScopes from LiveDebugVariables. llvm-svn: 282638
* Next set of additional error checks for invalid Mach-O files for theKevin Enderby2016-09-281-0/+32
| | | | | | | | | | other load commands that use the Mach::version_min_command type but not used in llvm libObject code but used in llvm tool code. This includes LC_VERSION_MIN_MACOSX, LC_VERSION_MIN_IPHONEOS, LC_VERSION_MIN_TVOS and LC_VERSION_MIN_WATCHOS load commands. llvm-svn: 282635
* Refactor the ProfileSummaryInfo to use doInitialization and doFinalization ↵Dehao Chen2016-09-284-20/+15
| | | | | | | | | | | | | | to handle Module update. Summary: This refactors the change in r282616 Reviewers: davidxl, eraman, mehdi_amini Subscribers: mehdi_amini, davide, llvm-commits Differential Revision: https://reviews.llvm.org/D25041 llvm-svn: 282630
* IfConversion: Add implicit uses for redefined regs with live subregistersKrzysztof Parzyszek2016-09-281-0/+11
| | | | | | | | | | Normally, if conversion would add implicit uses for redefined registers, e.g. R0<def> = add_if ..., R0<imp-use>. However, if only subregisters of R0 are known to be live but not R0 itself, such implicit uses will not be added, causing prior definitions of such subregisters and R0 itself to become dead. llvm-svn: 282626
* [AMDGPU] Promote uniform i16 ops to i32 ops for targets that have 16 bit ↵Konstantin Zhuravlyov2016-09-282-3/+238
| | | | | | | | instructions Differential Revision: https://reviews.llvm.org/D24125 llvm-svn: 282624
* Fix the bug introduced in r282616.Dehao Chen2016-09-281-1/+1
| | | | llvm-svn: 282618
* Fix the bug when -compile-twice is specified, the PSI will be invalidated.Dehao Chen2016-09-281-3/+12
| | | | | | | | | | | | | | Summary: When using llc with -compile-twice, module is generated twice, but getAnalysis<ProfileSummaryInfoWrapperPass>().getPSI will still get the old PSI with the original (invalidated) Module. This patch checks if the module has changed when calling getPSI, if yes, update the module and invalidate the Summary. The bug does not show up in the current llc because PSI is not used in CodeGen yet. But with https://reviews.llvm.org/D24989, the bug will be exposed by test/CodeGen/PowerPC/pr26378.ll Reviewers: eraman, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24993 llvm-svn: 282616
* Don't look through addrspacecast in GetPointerBaseWithConstantOffsetArtur Pilipenko2016-09-281-2/+7
| | | | | | | | | | | | Pointers in different addrspaces can have different sizes, so it's not valid to look through addrspace cast calculating base and offset for a value. This is similar to D13008. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D24729 llvm-svn: 282612
* Teach LiveDebugValues about lexical scopes.Adrian Prantl2016-09-281-8/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This addresses PR26055 LiveDebugValues is very slow. Contrary to the old LiveDebugVariables pass LiveDebugValues currently doesn't look at the lexical scopes before inserting a DBG_VALUE intrinsic. This means that we often propagate DBG_VALUEs much further down than necessary. This is especially noticeable in large C++ functions with many inlined method calls that all use the same "this"-pointer. For example, in the following code it makes no sense to propagate the inlined variable a from the first inlined call to f() into any of the subsequent basic blocks, because the variable will always be out of scope: void sink(int a); void __attribute((always_inline)) f(int a) { sink(a); } void foo(int i) { f(i); if (i) f(i); f(i); } This patch reuses the LexicalScopes infrastructure we have for LiveDebugVariables to take this into account. The effect on compile time and memory consumption is quite noticeable: I tested a benchmark that is a large C++ source with an enormous amount of inlined "this"-pointers that would previously eat >24GiB (most of them for DBG_VALUE intrinsics) and whose compile time was dominated by LiveDebugValues. With this patch applied the memory consumption is 1GiB and 1.7% of the time is spent in LiveDebugValues. https://reviews.llvm.org/D24994 Thanks to Daniel Berlin and Keith Walker for reviewing! llvm-svn: 282611
* Rewrite loops to use range-based for. (NFC)Adrian Prantl2016-09-281-17/+5
| | | | llvm-svn: 282608
* [NVPTX] Added intrinsics for atom.gen.{sys|cta}.* instructions.Artem Belevich2016-09-287-16/+263
| | | | | | | | These are only available on sm_60+ GPUs. Differential Revision: https://reviews.llvm.org/D24943 llvm-svn: 282607
* [SCEV] Use a SmallPtrSet as a temporary union predicate; NFCSanjoy Das2016-09-281-55/+65
| | | | | | | | | | | | | | | | Summary: Instead of creating and destroying SCEVUnionPredicate instances (which internally creates and destroys a DenseMap), use temporary SmallPtrSet instances of remember the set of predicates that will get reified into a SCEVUnionPredicate. Reviewers: silviu.baranga, sbaranga Subscribers: sanjoy, mcrosier, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D25000 llvm-svn: 282606
* Revert "In visitSTORE, always use FindBetterChain, rather than only when ↵Nirav Dave2016-09-283-121/+282
| | | | | | | | UseAA is enabled." This reverts commit r282600 due to test failues with MCJIT llvm-svn: 282604
* [AVR] Rename the builtin calling convention namesDylan McKay2016-09-281-3/+3
| | | | | | 'BUILTIN' is clearer than 'RT' in this context. llvm-svn: 282602
* [x86] Accept 'retn' as an alias to 'ret[lqw]'\'ret' (At&t\Intel)Marina Yatsina2016-09-281-0/+6
| | | | | | | | | | Implement 'retn' simply by aliasing it to the relevant 'ret' instruction Commit on behalf of coby Differential Revision: https://reviews.llvm.org/D24346 llvm-svn: 282601
* In visitSTORE, always use FindBetterChain, rather than only when UseAA is ↵Nirav Dave2016-09-283-282/+121
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | enabled. Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. Whem merging stores, search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates *worse* code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores *CAN* be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll - This test appears to work but no longer exhibits the spill behavior. Reviewers: arsenm, hfinkel, tstellarAMD, nhaehnle, jyknight Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 282600
* [AVR] Import the LLVM namespace inside AVRMCTargetDesc.cppDylan McKay2016-09-281-0/+2
| | | | llvm-svn: 282598
* [AVR] Add AVRMCTargetDesc.cppDylan McKay2016-09-283-4/+97
| | | | | | | | | | | | | | Summary: This adds the AVRMCTargetDesc file in tree. It allows creation of the core classes used in the backend. Reviewers: arsenm, kparzysz Subscribers: wdng, beanz, mgorny Differential Revision: https://reviews.llvm.org/D25023 llvm-svn: 282597
* [AVR] Update the signature of createAVRAsmBackendDylan McKay2016-09-281-1/+3
| | | | | | It has been recently changed to also take a MCTargetOptions structure. llvm-svn: 282594
* [AVR] Enable the assembly parserDylan McKay2016-09-282-15/+18
| | | | | | | | We very recently landed the code. This commit enables the parser. It also adds a missing include to AVRAsmParser.cpp llvm-svn: 282593
* [InstSimplify] allow or-of-icmps folds with vector splat constantsSanjay Patel2016-09-281-12/+11
| | | | llvm-svn: 282592
* [InstSimplify] allow and-of-icmps folds with vector splat constantsSanjay Patel2016-09-281-10/+9
| | | | llvm-svn: 282590
* [AVR] Merge most recent changes to AVRInstrInfo.tdDylan McKay2016-09-281-21/+85
| | | | | | | | | This adds two new things: - Operand types per fixup - Atomic pseudo operations llvm-svn: 282588
* [AVR] Update the data layoutDylan McKay2016-09-281-1/+3
| | | | | | | | | | | | | | | The previous data layout caused issues when dealing with atomics. Foe example, it is illegal to load a 16-bit value with less than 16-bits of alignment. This changes the data layout so that all types are aligned by at least their own width. Interestingly, this also _slightly_ decreased register pressure in some cases. llvm-svn: 282587
* [AVR] Handle AVR relocations when handling ELF filesDylan McKay2016-09-281-0/+7
| | | | llvm-svn: 282586
* [AVR] Add assembly parserDylan McKay2016-09-285-1/+658
| | | | | | | | | | | | Summary: This patch adds the AVRAsmParser library. Reviewers: arsenm, kparzysz Subscribers: wdng, beanz, mgorny, kparzysz, simoncook, jtbandes, llvm-commits Differential Revision: https://reviews.llvm.org/D20046 llvm-svn: 282584
* [X86][FastISel] Use a COPY from K register to a GPR instead of a K operationGuy Blank2016-09-281-27/+31
| | | | | | | | | | | The KORTEST was introduced due to a bug where a TEST instruction used a K register. but, turns out that the opposite case of KORTEST using a GPR is now happening The change removes the KORTEST flow and adds a COPY instruction from the K reg to a GPR. Differential Revision: https://reviews.llvm.org/D24953 llvm-svn: 282580
* Strip trailing whitespaceSimon Pilgrim2016-09-281-1/+1
| | | | llvm-svn: 282579
* [SystemZ] Implementation of getUnrollingPreferences().Jonas Paulsson2016-09-283-6/+62
| | | | | | | | | | | | | | This commit enables more unrolling for SystemZ by implementing the SystemZTargetTransformInfo::getUnrollingPreferences() method. It has been found that it is better to only unroll moderately, so the DefaultUnrollRuntimeCount has been moved into UnrollingPreferences in order to set this to a lower value for SystemZ (4). Reviewers: Evgeny Stupachenko, Ulrich Weigand. https://reviews.llvm.org/D24451 llvm-svn: 282570
* [DAG] Remove isVectorClearMaskLegal() check from vector_build dagcombineMichael Kuperstein2016-09-281-7/+0
| | | | | | | | | | | | This check currently doesn't seem to do anything useful on any in-tree target: On non-x86, it always evaluates to false, so we never hit the code path that creates the shuffle with zero. On x86, it just forwards to isShuffleMaskLegal(), which is a reasonable thing to query in general, but doesn't make sense if only restricted to zero blends. Differential Revision: https://reviews.llvm.org/D24625 llvm-svn: 282567
* [libFuzzer] speedup TracePC::FinalizeTraceKostya Serebryany2016-09-282-15/+22
| | | | llvm-svn: 282562
* [LAA] Rename emitAnalysis to recordAnalys. NFCAdam Nemet2016-09-281-23/+20
| | | | | | | | | Ever since LAA was split out into an analysis on its own, this function stopped emitting the report directly. Instead it stores it to be retrieved by the client which can then emit it as its own report (e.g. -Rpass-analysis=loop-vectorize). llvm-svn: 282561
* [Inliner] Port all opt remarks to new streaming APIAdam Nemet2016-09-273-35/+65
| | | | llvm-svn: 282559
* Next set of additional error checks for invalid Mach-O files for theKevin Enderby2016-09-271-0/+38
| | | | | | | | | | other load commands that use the MachO::dylinker_command type but not used in llvm libObject code but used in llvm tool code. This includes LC_ID_DYLINKER, LC_LOAD_DYLINKER and LC_DYLD_ENVIRONMENT load commands. llvm-svn: 282553
* [AArch64][RegisterBankInfo] Switch to statically allocated ValueMapping.Quentin Colombet2016-09-272-10/+24
| | | | | | | | Another step toward TableGen'ed like structure for the RegisterBankInfo of AArch64. By doing this, we also save a bit of compile time for the exact same output. llvm-svn: 282550
* [AArch64][RegisterBankInfo] Fix copy/paste in comments.Quentin Colombet2016-09-271-3/+3
| | | | | | NFC. llvm-svn: 282549
* [x86] add folds for FP logic with vector zerosSanjay Patel2016-09-271-17/+34
| | | | | | | | | | | | The 'or' case shows up in copysign. The copysign code also had redundant checking for a scalar zero operand with 'and', so I removed that. I'm not sure how to test vector 'and', 'andn', and 'xor' yet, but it seems better to just include all of the logic ops since we're fixing 'or' anyway. llvm-svn: 282546
* Shorten DiagnosticInfoOptimizationRemark* to OptimizationRemark*. NFCAdam Nemet2016-09-276-31/+26
| | | | | | | With the new streaming interface, these class names need to be typed a lot and it's way too looong. llvm-svn: 282544
* [TargetRegisterInfo, AArch64] Add target hook for isConstantPhysReg().Geoff Berry2016-09-273-1/+10
| | | | | | | | | | | | | | | | | | | Summary: The current implementation of isConstantPhysReg() checks for defs of physical registers to determine if they are constant. Some architectures (e.g. AArch64 XZR/WZR) have registers that are constant and may be used as destinations to indicate the generated value is discarded, preventing isConstantPhysReg() from returning true. This change adds a TargetRegisterInfo hook that overrides the no defs check for cases such as this. Reviewers: MatzeB, qcolombet, t.p.northover, jmolloy Subscribers: junbuml, aemerson, mcrosier, rengolin Differential Revision: https://reviews.llvm.org/D24570 llvm-svn: 282543
* [Inliner] Fold the analysis remark into the missed remarkAdam Nemet2016-09-271-6/+4
| | | | | | | | | | | | | | | There is really no reason for these to be separate. The vectorizer started this pretty bad tradition that the text of the missed remarks is pretty meaningless, i.e. vectorization failed. There, you have to query analysis to get the full picture. I think we should just explain the reason for missing the optimization in the missed remark when possible. Analysis remarks should provide information that the pass gathers regardless whether the optimization is passing or not. llvm-svn: 282542
* [LoopSimplify] When simplifying phis in loop-simplify, do it only if it ↵Michael Zolotukhin2016-09-271-2/+4
| | | | | | preserves LCSSA form. llvm-svn: 282541
* Output optimization remarks in YAMLAdam Nemet2016-09-275-6/+140
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (Re-committed after moving the template specialization under the yaml namespace. GCC was complaining about this.) This allows various presentation of this data using an external tool. This was first recommended here[1]. As an example, consider this module: 1 int foo(); 2 int bar(); 3 4 int baz() { 5 return foo() + bar(); 6 } The inliner generates these missed-optimization remarks today (the hotness information is pulled from PGO): remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30) remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30) Now with -pass-remarks-output=<yaml-file>, we generate this YAML file: --- !Missed Pass: inline Name: NotInlined DebugLoc: { File: /tmp/s.c, Line: 5, Column: 10 } Function: baz Hotness: 30 Args: - Callee: foo - String: will not be inlined into - Caller: baz ... --- !Missed Pass: inline Name: NotInlined DebugLoc: { File: /tmp/s.c, Line: 5, Column: 18 } Function: baz Hotness: 30 Args: - Callee: bar - String: will not be inlined into - Caller: baz ... This is a summary of the high-level decisions: * There is a new streaming interface to emit optimization remarks. E.g. for the inliner remark above: ORE.emit(DiagnosticInfoOptimizationRemarkMissed( DEBUG_TYPE, "NotInlined", &I) << NV("Callee", Callee) << " will not be inlined into " << NV("Caller", CS.getCaller()) << setIsVerbose()); NV stands for named value and allows the YAML client to process a remark using its name (NotInlined) and the named arguments (Callee and Caller) without parsing the text of the message. Subsequent patches will update ORE users to use the new streaming API. * I am using YAML I/O for writing the YAML file. YAML I/O requires you to specify reading and writing at once but reading is highly non-trivial for some of the more complex LLVM types. Since it's not clear that we (ever) want to use LLVM to parse this YAML file, the code supports and asserts that we're writing only. On the other hand, I did experiment that the class hierarchy starting at DiagnosticInfoOptimizationBase can be mapped back from YAML generated here (see D24479). * The YAML stream is stored in the LLVM context. * In the example, we can probably further specify the IR value used, i.e. print "Function" rather than "Value". * As before hotness is computed in the analysis pass instead of DiganosticInfo. This avoids the layering problem since BFI is in Analysis while DiagnosticInfo is in IR. [1] https://reviews.llvm.org/D19678#419445 Differential Revision: https://reviews.llvm.org/D24587 llvm-svn: 282539
* Sort headersAdam Nemet2016-09-271-1/+1
| | | | llvm-svn: 282538
* Statistic: Bring back printing on exit by defaultMatthias Braun2016-09-271-2/+4
| | | | | | | | Turns out several external projects relied on llvm printing statistics on exit. Let's go back to this behaviour by default and have an optional parameter to disable it. llvm-svn: 282532
OpenPOWER on IntegriCloud