path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* Revert "[RegAllocGreedy] Attempt to split unspillable live intervals"Dylan McKay2016-09-301-8/+6
| | | | | | It was accidentally committed. llvm-svn: 282855
* [RegAllocGreedy] Attempt to split unspillable live intervalsDylan McKay2016-09-301-6/+8
| | | | | | | | | | | | | | | | | | | | | | | Summary: Previously, when allocating unspillable live ranges, we would never attempt to split. We would always bail out and try last ditch graph recoloring. This patch changes this by attempting to split all live intervals before performing recoloring. This fixes LLVM bug PR14879. I can't add test cases for any backends other than AVR because none of them have small enough register classes to trigger the bug. Reviewers: qcolombet Subscribers: MatzeB Differential Revision: https://reviews.llvm.org/D25070 llvm-svn: 282852
* Clamp version number in S_COMPILE3 to avoid overflowing 16-bit field.Adrian McCarthy2016-09-291-5/+6
| | | | llvm-svn: 282761
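The clamping described above can be sketched as follows. This is a minimal illustration, not the code from the commit: the helper name `clampToU16` is hypothetical, and it assumes each version component is stored in an unsigned 16-bit field.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not the function from the commit): saturate one
// version component so that it fits an unsigned 16-bit record field
// instead of silently wrapping around.
inline std::uint16_t clampToU16(unsigned Value) {
  const unsigned Max = std::numeric_limits<std::uint16_t>::max(); // 65535
  const unsigned Clamped = std::min(Value, Max);
  assert(Clamped <= Max && "clamped value must fit in 16 bits");
  return static_cast<std::uint16_t>(Clamped);
}
```

Saturating keeps an oversized component at the largest representable value, whereas a plain truncating cast would wrap it to an unrelated small number.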
* [RegisterBankInfo] Change the default mapping for Copy and PHI.Quentin Colombet2016-09-292-54/+37
| | | | | | | | | | | | | | Instead of producing a mapping for all the operands, we only generate a mapping for the definition. Indeed, the other operands are not constrained by the instruction and thus, we should leave the choice to the actual definition to do the right thing. In practice this is almost NFC, but with one advantage. We will have only one instance of OperandsMapping for each copy and phi that map to one register bank instead of one different instance for each different number of operands for each copy and phi. llvm-svn: 282756
* [codeview] Use character types for all byte-sized integer typesReid Kleckner2016-09-291-10/+10
| | | | | | | | | | | | | | | The VS debugger doesn't appear to understand the 0x68 or 0x69 type indices, which were probably intended for use on a platform where a C 'int' is 8 bits. So, use the character types instead. Clang was already using the character types because '[u]int8_t' is usually defined in terms of 'char'. See the Rust issue for screenshots of what VS does: https://github.com/rust-lang/rust/issues/36646 Fixes PR30552 llvm-svn: 282739
* MachineFunction: Add missing newline in debug print()Matthias Braun2016-09-291-0/+1
| | | | | | Should not be a functional but an aesthetic change. llvm-svn: 282669
* [RegisterBankInfo] Uniquely generate OperandsMapping.Quentin Colombet2016-09-281-8/+63
| | | | | | | | | This is a step toward statically allocating InstructionMapping. Like the previous few commits, the goal is to move toward a TableGen'ed like structure with no dynamic allocation at all. This should already improve compile time by getting rid of a bunch of memmove of SmallVectors. llvm-svn: 282643
* [RegisterBankInfo] Rework the APIs of ValueMapping.Quentin Colombet2016-09-281-10/+12
| | | | | | | This is a preparatory commit for more TableGen-like structure. NFC llvm-svn: 282642
* Remove dead code from LiveDebugVariables.cpp (NFC)Adrian Prantl2016-09-281-44/+10
| | | | | | LiveDebugVariables doesn't propagate DBG_VALUEs across basic block boundaries any more; this functionality was split into LiveDebugValues. We can thus drop the now dead references to LexicalScopes from LiveDebugVariables. llvm-svn: 282638
* IfConversion: Add implicit uses for redefined regs with live subregistersKrzysztof Parzyszek2016-09-281-0/+11
| | | | | | | | | | Normally, if conversion would add implicit uses for redefined registers, e.g. R0<def> = add_if ..., R0<imp-use>. However, if only subregisters of R0 are known to be live but not R0 itself, such implicit uses will not be added, causing prior definitions of such subregisters and R0 itself to become dead. llvm-svn: 282626
* Teach LiveDebugValues about lexical scopes.Adrian Prantl2016-09-281-8/+43
| | | | This addresses PR26055: LiveDebugValues is very slow.
| | | | Contrary to the old LiveDebugVariables pass, LiveDebugValues currently doesn't look at the lexical scopes before inserting a DBG_VALUE intrinsic. This means that we often propagate DBG_VALUEs much further down than necessary. This is especially noticeable in large C++ functions with many inlined method calls that all use the same "this"-pointer.
| | | | For example, in the following code it makes no sense to propagate the inlined variable a from the first inlined call to f() into any of the subsequent basic blocks, because the variable will always be out of scope:
| | | |   void sink(int a);
| | | |   void __attribute((always_inline)) f(int a) { sink(a); }
| | | |   void foo(int i) {
| | | |     f(i);
| | | |     if (i)
| | | |       f(i);
| | | |     f(i);
| | | |   }
| | | | This patch reuses the LexicalScopes infrastructure we have for LiveDebugVariables to take this into account.
| | | | The effect on compile time and memory consumption is quite noticeable: I tested a benchmark that is a large C++ source with an enormous amount of inlined "this"-pointers that would previously eat >24GiB (most of them for DBG_VALUE intrinsics) and whose compile time was dominated by LiveDebugValues. With this patch applied the memory consumption is 1GiB and 1.7% of the time is spent in LiveDebugValues.
| | | | https://reviews.llvm.org/D24994 Thanks to Daniel Berlin and Keith Walker for reviewing! llvm-svn: 282611
* Rewrite loops to use range-based for. (NFC)Adrian Prantl2016-09-281-17/+5
| | | | llvm-svn: 282608
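For readers unfamiliar with this kind of NFC cleanup, here is a generic sketch of the transformation. It is not taken from the commit: the function names are hypothetical and `std::vector` stands in for whatever containers the original loops walked.

```cpp
#include <cassert>
#include <vector>

// Index-based form, in the style such loops are often written before cleanup.
int sumIndexed(const std::vector<int> &Values) {
  int Sum = 0;
  for (std::vector<int>::size_type I = 0, E = Values.size(); I != E; ++I)
    Sum += Values[I];
  return Sum;
}

// Equivalent range-based form: same traversal order, no index bookkeeping.
int sumRanged(const std::vector<int> &Values) {
  int Sum = 0;
  for (int V : Values)
    Sum += V;
  return Sum;
}
```

Both functions visit the elements in the same order, which is why a rewrite of this shape can be labelled as having no functional change.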
* Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled."Nirav Dave2016-09-282-121/+272
| | | | | | This reverts commit r282600 due to test failures with MCJIT llvm-svn: 282604
* In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.Nirav Dave2016-09-282-272/+121
| | | | Simplify Consecutive Merge Store Candidate Search
| | | | Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic.
| | | | When merging stores, search up the chain through a single load, and find all possible stores by looking down through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions).
| | | | Additional Minor Changes:
| | | | 1. Finishes removing unused AliasLoad code
| | | | 2. Unifies the chain aggregation in the merged stores across code paths
| | | | 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits.
| | | | 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests.
| | | | This finishes the change Matt Arsenault started in r246307 and jyknight's original patch.
| | | | Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations
| | | | Noteworthy tests:
| | | | CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right.
| | | | CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates *worse* code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently.
| | | | CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself?
| | | | CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding.
| | | | CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores
| | | | CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls
| | | | CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores
| | | | CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores *CAN* be moved past volatile loads, and now are.
| | | | CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing.
| | | | CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged.
| | | | CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll - This test appears to work but no longer exhibits the spill behavior.
| | | | Reviewers: arsenm, hfinkel, tstellarAMD, nhaehnle, jyknight
| | | | Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, resistor, tstellarAMD, t.p.northover, spatel
| | | | Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 282600
* [DAG] Remove isVectorClearMaskLegal() check from vector_build dagcombineMichael Kuperstein2016-09-281-7/+0
| | | | | | | | | | | | This check currently doesn't seem to do anything useful on any in-tree target: On non-x86, it always evaluates to false, so we never hit the code path that creates the shuffle with zero. On x86, it just forwards to isShuffleMaskLegal(), which is a reasonable thing to query in general, but doesn't make sense if only restricted to zero blends. Differential Revision: https://reviews.llvm.org/D24625 llvm-svn: 282567
* [TargetRegisterInfo, AArch64] Add target hook for isConstantPhysReg().Geoff Berry2016-09-271-1/+5
| | | | | | | | | | | | | | | | | | | Summary: The current implementation of isConstantPhysReg() checks for defs of physical registers to determine if they are constant. Some architectures (e.g. AArch64 XZR/WZR) have registers that are constant and may be used as destinations to indicate the generated value is discarded, preventing isConstantPhysReg() from returning true. This change adds a TargetRegisterInfo hook that overrides the no defs check for cases such as this. Reviewers: MatzeB, qcolombet, t.p.northover, jmolloy Subscribers: junbuml, aemerson, mcrosier, rengolin Differential Revision: https://reviews.llvm.org/D24570 llvm-svn: 282543
* Propagate DBG_VALUE entries when there are unvisited predecessorsKeith Walker2016-09-271-10/+24
| | | | | | | | | | | | | | | | | Variables are sometimes missing their debug location information in blocks in which the variables should be available. This would occur when one or more predecessor blocks had not yet been visited by the routine which propagated the information from predecessor blocks. This is addressed by only considering predecessor blocks which have already been visited. The solution to this problem was suggested by Daniel Berlin on the LLVM developer mailing list. Differential Revision: https://reviews.llvm.org/D24927 llvm-svn: 282506
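The idea of joining over visited predecessors only can be sketched as below. This is a simplified model rather than the LiveDebugValues code: `Block`, `VarSet` and `joinVisitedPreds` are hypothetical names, variable locations are reduced to strings, and the join is assumed to be a plain set intersection.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>
#include <string>
#include <vector>

using VarSet = std::set<std::string>;

// Hypothetical, simplified stand-in for a basic block in the dataflow problem.
struct Block {
  std::vector<std::size_t> Preds; // indices of predecessor blocks
  VarSet Out;                     // locations known at the end of the block
  bool Visited = false;           // has the propagation reached this block?
};

// Compute the incoming set for Blocks[Idx] as the intersection of the Out
// sets of those predecessors that have already been visited.
VarSet joinVisitedPreds(const std::vector<Block> &Blocks, std::size_t Idx) {
  assert(Idx < Blocks.size() && "block index out of range");
  VarSet In;
  bool SeenVisitedPred = false;
  for (std::size_t P : Blocks[Idx].Preds) {
    const Block &Pred = Blocks[P];
    if (!Pred.Visited)
      continue; // skip it; its still-empty Out set must not erase everything
    if (!SeenVisitedPred) {
      In = Pred.Out; // first visited predecessor seeds the result
      SeenVisitedPred = true;
      continue;
    }
    VarSet Common;
    std::set_intersection(In.begin(), In.end(), Pred.Out.begin(),
                          Pred.Out.end(),
                          std::inserter(Common, Common.begin()));
    In.swap(Common);
  }
  return In;
}
```

In this model, intersecting with an unvisited predecessor would always produce an empty set, which corresponds to the missing location information the commit message describes.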
* Add support to optionally limit the size of jump tables.Evandro Menezes2016-09-262-12/+38
| | | | | | | | | | | | | | | | | Many high-performance processors have a dedicated branch predictor for indirect branches, commonly used with jump tables. As sophisticated as such branch predictors are, they tend to have well defined limits beyond which their effectiveness is hampered or even nullified. One such limit is the number of possible destinations for a given indirect branch that such branch predictors can handle. This patch considers a limit that a target may set to the number of destination addresses in a jump table. Patch by: Evandro Menezes <e.menezes@samsung.com>, Aditya Kumar <aditya.k7@samsung.com>, Sebastian Pop <s.pop@samsung.com>. Differential revision: https://reviews.llvm.org/D21940 llvm-svn: 282412
* [ARM] Promote small global constants to constant poolsJames Molloy2016-09-261-3/+10
| | | | If a constant is unnamed_addr and is only used within one function, we can save on the code size and runtime cost of an indirection by changing the global's storage to inside the constant pool. For example, instead of:
| | | |     ldr r0, .CPI0
| | | |     bl printf
| | | |     bx lr
| | | | .CPI0: &format_string
| | | | format_string: .asciz "hello, world!\n"
| | | | We can emit:
| | | |     adr r0, .CPI0
| | | |     bl printf
| | | |     bx lr
| | | | .CPI0: .asciz "hello, world!\n"
| | | | This can cause significant code size savings when many small strings are used in one function (4 bytes per string).
| | | | This recommit contains fixes for a nasty bug related to fast-isel fallback - because fast-isel doesn't know about this optimization, if it runs and emits references to a string that we inline (because fast-isel fell back to SDAG) we will end up with an inlined string and also an out-of-line string, and we won't emit the out-of-line string, causing backend failures.
| | | | It also contains fixes for emitting .text relocations which made the sanitizer bots unhappy. llvm-svn: 282387
* [X86][avx512] Fix bug in masked compress store.Ayman Musa2016-09-261-3/+3
| | | | | | Differential Revision: https://reviews.llvm.org/D23984 llvm-svn: 282381
* [RegisterBankInfo] Constify the member of the XXXMapping maps.Quentin Colombet2016-09-241-2/+2
| | | | | | | This makes it obvious that items in those maps behave like statically created objects. llvm-svn: 282327
* [RegisterBankInfo] Add statistics for dynamic value mappings.Quentin Colombet2016-09-241-0/+8
| | | | | | | Like partial mappings, as we move toward TableGen'ed information, the number should reach zero eventually. llvm-svn: 282325
* [RegisterBankInfo] Uniquely generate ValueMapping.Quentin Colombet2016-09-241-11/+52
| | | | | | This is a step toward statically allocating ValueMapping. Like the previous few commits, the goal is to move toward a TableGen'ed like structure with no dynamic allocation at all. llvm-svn: 282324
* [RegisterBankInfo] Keep valid pointers for PartialMappings.Quentin Colombet2016-09-241-4/+9
| | | | | | | | | | | | | | | Previously we were using the address of the unique instance of a partial mapping in the related map to access this instance. However, when the map grows, the whole set of instances may be moved elsewhere and the previous addresses are not valid anymore. Instead, keep the address of the unique heap allocated instance of a partial mapping. Note: I did not see any actual bugs for that problem as the number of partial mappings dynamically allocated is small (<= 4). llvm-svn: 282323
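The fix described above, handing out the address of a heap-allocated instance rather than the address of a slot inside a growing container, can be sketched as follows. This is an illustration under assumptions, not the RegisterBankInfo code: `PartialMappingPool` and `getUnique` are hypothetical names, `PartialMapping` is reduced to two fields, and a `std::vector` with a linear search stands in for the real map (reasonable here only because the commit notes the number of instances is small).

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Reduced, hypothetical stand-in for a partial mapping.
struct PartialMapping {
  unsigned StartIdx;
  unsigned Length;
};

// Hands out one unique instance per (StartIdx, Length) pair. The container
// only owns pointers, so when it grows and relocates its elements the
// PartialMapping objects themselves stay where they are on the heap.
class PartialMappingPool {
  std::vector<std::unique_ptr<PartialMapping>> Pool;

public:
  const PartialMapping *getUnique(unsigned StartIdx, unsigned Length) {
    assert(Length > 0 && "a partial mapping must cover at least one bit");
    for (const std::unique_ptr<PartialMapping> &PM : Pool)
      if (PM->StartIdx == StartIdx && PM->Length == Length)
        return PM.get(); // already created: reuse the unique instance
    Pool.push_back(std::unique_ptr<PartialMapping>(
        new PartialMapping{StartIdx, Length}));
    return Pool.back().get();
  }
};
```

Had the pool stored `PartialMapping` objects by value, a pointer obtained before a reallocation would dangle afterwards, which is the hazard the commit message points out.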
* llc: Add -start-before/-stop-before optionsMatthias Braun2016-09-232-8/+12
| | | | | | Differential Revision: https://reviews.llvm.org/D23089 llvm-svn: 282302
* [ResetMachineFunction] Populate the comments in the header of the file.Quentin Colombet2016-09-231-3/+6
| | | | | | NFC llvm-svn: 282276
* [ResetMachineFunction] Add statistic on the number of reset functions.Quentin Colombet2016-09-231-0/+4
| | | | | | As the development of GlobalISel moves forward, this statistic should strictly decrease until it reaches zero. At this point, it would mean GlobalISel can replace SDISel (at least on the tested inputs :P). llvm-svn: 282275
* [RegisterBankInfo] Add statistics for dynamic partial mappings.Quentin Colombet2016-09-231-0/+11
| | | | | | | | Collect statistics about the number of partial mappings dynamically allocated and accessed. Ultimately, when the whole TableGen infrastructure is set, those numbers should be zero. llvm-svn: 282274
* ScheduleDAG: Match enum names when printing sdep kindsMatthias Braun2016-09-231-8/+8
| | | | | | | It is less confusing to have the same names in the debug print as the enum members. llvm-svn: 282273
* [RegBankSelect] Use DEBUG_TYPE instead of repeating the name of the passQuentin Colombet2016-09-231-2/+2
| | | | | | NFC llvm-svn: 282267
* [RegisterBank] Mark the dump method with LLVM_DUMP_METHOD.Quentin Colombet2016-09-231-1/+1
| | | | | | NFC llvm-svn: 282266
* [RegisterBankInfo] Mark the dump methods with LLVM_DUMP_METHOD.Quentin Colombet2016-09-231-4/+4
| | | | | | NFC llvm-svn: 282221
* [RegisterBankInfo] Check that the mapping covers the interesting bits.Quentin Colombet2016-09-231-2/+3
| | | | | | | | | | | | | | | In the verify method of the ValueMapping class we used to check that the mapping exactly matches the bits of the input value. This is problematic for statically allocated mappings because we would need a different mapping for each different size of the value that maps on one instruction. For instance, with such scheme, we would need a different mapping for a value of size 1, 5, 23 whereas they all end up on a 32-bit wide instruction. Therefore, change the verifier to check that the meaningful bits are covered by the mapping instead of matching them. llvm-svn: 282214
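The difference between the old and the new check can be sketched like this. It is a simplified model, not the actual `verify` method: the struct and function names are hypothetical, and the break-down is assumed to be sorted by start index.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Reduced, hypothetical stand-in for one piece of a value's break-down.
struct PartialMapping {
  unsigned StartIdx; // first bit covered by this piece
  unsigned Length;   // number of bits covered
};

// Number of bits covered contiguously from bit 0. Assumes the pieces are
// sorted by StartIdx; stops at the first gap.
unsigned coveredBits(const std::vector<PartialMapping> &BreakDown) {
  unsigned End = 0;
  for (const PartialMapping &PM : BreakDown) {
    if (PM.StartIdx > End)
      break; // a gap: bits from End up to StartIdx are not covered
    End = std::max(End, PM.StartIdx + PM.Length);
  }
  return End;
}

// Old-style check: the mapping must match the value size exactly.
bool matchesExactly(const std::vector<PartialMapping> &BreakDown,
                    unsigned ValueSizeInBits) {
  return coveredBits(BreakDown) == ValueSizeInBits;
}

// New-style check: the mapping only has to cover the meaningful bits.
bool coversMeaningfulBits(const std::vector<PartialMapping> &BreakDown,
                          unsigned ValueSizeInBits) {
  return coveredBits(BreakDown) >= ValueSizeInBits;
}
```

With the relaxed check, a single 32-bit mapping is accepted for values of 1, 5 or 23 bits, so one statically allocated mapping can serve all of them.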
* [RegisterBankInfo] Use array instead of SmallVector for BreakDown.Quentin Colombet2016-09-232-42/+42
| | | | | | | | | | | This is another step toward TableGen'ed like structures. The BreakDown of the mapping of the value will be statically computed by TableGen, thus we only have to point to the right entry in the table instead of dynamically allocating the mapping for each instruction. We still support the dynamic allocation through a factory of PartialMapping to ease the bring-up of the targets while the TableGen backend is not available. llvm-svn: 282213
* MachineScheduler: Slightly simplify release nodeMatthias Braun2016-09-221-14/+0
| | | | llvm-svn: 282201
* MachineScheduler: Remove ineffective heuristic; NFCMatthias Braun2016-09-221-11/+0
| | | | | | | | | Currently all nodes get added to the NextSU list when they are released, so any candidate must be in that list, making the heuristic ineffective. Remove it for now, we can add it back later in a working fashion if necessary. llvm-svn: 282200
* Win64: Don't emit unwind info for "leaf" functions (PR30337)Hans Wennborg2016-09-221-1/+1
| | | | | | | | | | | | According to MSDN (see the PR), functions which don't touch any callee-saved registers (including %rsp) don't need any unwind info. This patch makes LLVM not emit unwind info for such functions, to save binary size. Differential Revision: https://reviews.llvm.org/D24748 llvm-svn: 282185
* [DAG] Fix incorrect alignment of ext load.Nirav Dave2016-09-221-1/+1
| | | | | | | | | | | | Correctly use alignment size from loaded size not output value size. Reviewers: jyknight, tstellarAMD, arsenm Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23356 llvm-svn: 282177
* GlobalISel: handle stack-based parameters on AArch64.Tim Northover2016-09-221-0/+9
| | | | llvm-svn: 282153
* [RegisterBankInfo] Move to statically allocated RegisterBank.Quentin Colombet2016-09-221-3/+8
| | | | | | | | | | This commit is basically the first step toward what RegisterBankInfo will look like when it gets TableGen'ed. It introduces a XXXGenRegisterBankInfo.def file that is what TableGen will issue at some point. Moreover, the RegBanks field in RegisterBankInfo changed to reflect the static (compile time) aspect of the information. llvm-svn: 282131
* [RegisterBankInfo] Take advantage of the extra argument of SmallVector::resize.Quentin Colombet2016-09-221-3/+1
| | | | | | | | When initializing an instance of OperandsMapper, instead of using SmallVector::resize followed by std::fill, use the function that directly does that in SmallVector. llvm-svn: 282130
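The simplification can be illustrated with `std::vector`, whose `resize` has the same two-argument form; the sketch assumes `SmallVector::resize` behaves analogously, as the commit message states. The function names and the fill value `-1` are illustrative only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Two-step form: grow with value-initialized elements, then overwrite them.
std::vector<int> initTwoSteps(std::size_t N) {
  std::vector<int> V;
  V.resize(N);
  std::fill(V.begin(), V.end(), -1);
  return V;
}

// One-step form: the second argument of resize is the value given to every
// newly added element.
std::vector<int> initOneStep(std::size_t N) {
  std::vector<int> V;
  V.resize(N, -1);
  return V;
}
```

The two forms produce the same contents when the container starts out empty; on a non-empty container the one-step form would leave existing elements untouched, while `std::fill` over the whole range would overwrite them.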
* [MIRParser] Delete dead code. NFCI.Davide Italiano2016-09-211-12/+0
| | | | llvm-svn: 282098
* Disable tail calls if there is a swifterror argumentArnold Schwaighofer2016-09-211-0/+5
| | | | | | | ISel does not handle them correctly yet, i.e. we crash trying to emit tail call code. radar://28407842 llvm-svn: 282088
* GlobalISel: produce correct code for signext/zeroext ABI flags.Tim Northover2016-09-213-7/+89
| | | | | | | | We still don't really have an equivalent of "AssertXExt" in DAG, so we don't exploit the guarantees on the receiving side yet, but this should produce conservatively correct code on iOS ABIs. llvm-svn: 282069
* GlobalISel: pass Function to lowerFormalArguments directly (NFC).Tim Northover2016-09-211-2/+1
| | | | | | | | The only implementation that exists immediately looks it up anyway, and the information is needed to handle various parameter attributes (stored on the function itself). llvm-svn: 282068
* Revert "Remove extra argument used once onEric Christopher2016-09-201-8/+1
| | | | | | | | | | | | TargetMachine::getNameWithPrefix and inline the result into the singular caller." and "Remove more guts of TargetMachine::getNameWithPrefix and migrate one check to the TLOF mach-o version." temporarily until I can get the whole call migrated out of the TargetMachine as we could hit places where TLOF isn't valid. This reverts commits r281981 and r281983. llvm-svn: 282028
* [CodeGen] stop short-circuiting the SSP code for sspstrong.George Burgess IV2016-09-201-5/+0
| | | | | | | | | | | | | | | | | | This check caused us to skip adding layout information for calls to alloca in sspreq/sspstrong mode. We check properly for sspstrong later on (and add the correct layout info when doing so), so removing this shouldn't hurt. No test is included, since testing this using lit seems to require checking for exact offsets in asm, which is something that the lit tests for this avoid. If someone cares deeply, I'm happy to write a unittest or something to cover this, but that feels like overkill. Patch by Daniel Micay. Differential Revision: https://reviews.llvm.org/D22714 llvm-svn: 282022
* Mark ELF sections whose name start with .note as notePetr Hosek2016-09-201-0/+5
| | | | | | | | | Previously, such a section would be marked as SHT_PROGBITS, which made it impossible to use an initialized C variable declaration to emit an (allocated) ELF note. The new behavior is also consistent with the ELF assembly parser. Differential Revision: https://reviews.llvm.org/D24692 llvm-svn: 282010
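A name-based choice of section type along these lines can be sketched as follows. This is a deliberately reduced model, not the LLVM function: `sectionTypeForName` and the constant names are hypothetical, and only the `.note` prefix rule from the commit message is modelled. The numeric values are the standard ELF ones (SHT_PROGBITS is 1, SHT_NOTE is 7).

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Standard ELF section type values, under illustrative names.
constexpr std::uint32_t ElfShtProgbits = 1; // SHT_PROGBITS
constexpr std::uint32_t ElfShtNote = 7;     // SHT_NOTE

// Hypothetical helper: pick the section type from the section name alone.
// Only the ".note" prefix rule is modelled; everything else falls back to
// SHT_PROGBITS.
std::uint32_t sectionTypeForName(const std::string &Name) {
  if (Name.compare(0, 5, ".note") == 0)
    return ElfShtNote;
  return ElfShtProgbits;
}
```

Under this rule a section such as `.note.gnu.build-id` is typed as a note, while `.data` keeps the default type.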
* Fix syntactical nit from r281990.Adrian McCarthy2016-09-201-3/+3
| | | | llvm-svn: 281991
* Emit S_COMPILE3 CodeView recordAdrian McCarthy2016-09-202-0/+129
| | | | | | | | | | CodeView has an S_COMPILE3 record to identify the compiler and source language of the compiland. This record comes first in the debug$S section for the compiland. The debuggers rely on this record to know the source language of the code. There was a little test fallout from introducing a new record into the symbols subsection. Differential Revision: https://reviews.llvm.org/D24317 llvm-svn: 281990