summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* Fix machine operand traversal in ScheduleDAGInstrs::fixupKillsKrzysztof Parzyszek2016-10-051-2/+7
| | | | llvm-svn: 283315
* Re-commit "Use StringRef in Support/Darf APIs (NFC)"Mehdi Amini2016-10-051-4/+4
| | | | | | | This reverts commit r283285 and re-commit r283275 with a fix for format("%s", Str); where Str is a StringRef. llvm-svn: 283298
* Revert "Codegen: Tail-duplicate during placement."Kyle Butt2016-10-053-326/+41
| | | | | | | | | | This reverts commit 062ace9764953e9769142c1099281a345f9b6bdc. Issue with loop info and block removal revealed by polly. I have a fix for this issue already in another patch, I'll re-roll this together with that fix, and a test case. llvm-svn: 283292
* Use StringRef in FastISel API (NFC)Mehdi Amini2016-10-051-1/+1
| | | | llvm-svn: 283291
* Revert "Re-commit "Use StringRef in Support/Darf APIs (NFC)""Mehdi Amini2016-10-051-4/+4
| | | | | | One test seems randomly broken: DebugInfo/X86/gnu-public-names.ll llvm-svn: 283285
* Re-commit "Use StringRef in Support/Darf APIs (NFC)"Mehdi Amini2016-10-051-4/+4
| | | | | | | This reverts commit r283278 and re-commit r283275 with the update to fix the build on the LLDB side. llvm-svn: 283281
* Revert "Use StringRef in Support/Darf APIs (NFC)"Mehdi Amini2016-10-051-4/+4
| | | | | | This reverts commit r283275, it broke LLDB Android debug server. llvm-svn: 283278
* Use StringRef in Support/Darf APIs (NFC)Mehdi Amini2016-10-041-4/+4
| | | | llvm-svn: 283275
* Codegen: Tail-duplicate during placement.Kyle Butt2016-10-043-41/+326
| | | | | | | | | | | | | | | | | | | | | | | | | | The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. Issue from previous rollback fixed, and a new test was added for that case as well. Differential revision: https://reviews.llvm.org/D18226 llvm-svn: 283274
* Revert r283248. It caused failures in the hexagon buildbots.David L Kreitzer2016-10-041-6/+7
| | | | llvm-svn: 283254
* [Target] move reciprocal estimate settings from TargetOptions to TargetLoweringSanjay Patel2016-10-041-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | The motivation for the change is that we can't have pseudo-global settings for codegen living in TargetOptions because that doesn't work with LTO. Ideally, these reciprocal attributes will be moved to the instruction-level via FMF, metadata, or something else. But making them function attributes is at least an improvement over the current state. The ingredients of this patch are: Remove the reciprocal estimate command-line debug option. Add TargetRecip to TargetLowering. Remove TargetRecip from TargetOptions. Clean up the TargetRecip implementation to work with this new scheme. Set the default reciprocal settings in TargetLoweringBase (everything is off). Update the PowerPC defaults, users, and tests. Update the x86 defaults, users, and tests. Note that if this patch needs to be reverted, the related clang patch checked in at r283251 should be reverted too. Differential Revision: https://reviews.llvm.org/D24816 llvm-svn: 283252
* [safestack] Requires a valid TargetMachine to be passed to the SafeStack pass.David L Kreitzer2016-10-041-7/+6
| | | | | | | | Patch by Michael LeMay Differential revision: http://reviews.llvm.org/D24896 llvm-svn: 283248
* [SelectionDAG] Fix calling convention in expansion of ?MULO.whitequark2016-10-041-2/+12
| | | | | | | | | | | | | | | | | | | | | The SMULO/UMULO DAG nodes, when not directly supported by the target, expand to a multiplication twice as wide. In case that the resulting type is not legal, an __mul?i3 intrinsic is used. Since the type is not legal, the legalizer cannot directly call the intrinsic with the wide arguments; instead, it "pre-lowers" them by splitting them in halves. The "pre-lowering" code in essence made assumptions about the calling convention, specifically that i(N*2) values will be split into two iN values and passed in consecutive registers in little-endian order. This, naturally, breaks on a big-endian system, such as our OR1K out-of-tree backend. Thanks to James Miller <james@aatch.net> for help in debugging. Differential Revision: https://reviews.llvm.org/D25223 llvm-svn: 283203
* Revert "Codegen: Tail-duplicate during placement."Kyle Butt2016-10-043-321/+41
| | | | | | | | This reverts commit ff234efbe23528e4f4c80c78057b920a51f434b2. Causing crashes on aarch64 build. llvm-svn: 283172
* Codegen: Tail-duplicate during placement.Kyle Butt2016-10-043-41/+321
| | | | | | | | | | | | | | | | | | | | | The tail duplication pass uses an assumed layout when making duplication decisions. This is fine, but passes up duplication opportunities that may arise when blocks are outlined. Because we want the updated CFG to affect subsequent placement decisions, this change must occur during placement. In order to achieve this goal, TailDuplicationPass is split into a utility class, TailDuplicator, and the pass itself. The pass delegates nearly everything to the TailDuplicator object, except for looping over the blocks in a function. This allows the same code to be used for tail duplication in both places. This change, in concert with outlining optional branches, allows triangle shaped code to perform much better, esepecially when the taken/untaken branches are correlated, as it creates a second spine when the tests are small enough. llvm-svn: 283164
* fix formatting; NFCSanjay Patel2016-10-031-8/+5
| | | | llvm-svn: 283115
* Use StringRef in Registry API (NFC)Mehdi Amini2016-10-011-2/+2
| | | | llvm-svn: 283039
* Use StringRef instead of raw pointers in MCAsmInfo/MCInstrInfo APIs (NFC)Mehdi Amini2016-10-012-3/+3
| | | | llvm-svn: 283018
* Use StringRef in Datalayout API (NFC)Mehdi Amini2016-10-011-2/+2
| | | | llvm-svn: 283013
* Revert "Use StringRef in Datalayout API (NFC)"Mehdi Amini2016-10-011-1/+1
| | | | | | This reverts commit r283009. Bots are broken. llvm-svn: 283011
* Use StringRef in Datalayout API (NFC)Mehdi Amini2016-10-011-1/+1
| | | | llvm-svn: 283009
* Use StringRef in Pass/PassManager APIs (NFC)Mehdi Amini2016-10-0123-41/+25
| | | | llvm-svn: 283004
* Remove getTargetTriple and update all uses to use the Triple offEric Christopher2016-10-011-8/+2
| | | | | | of the TargetMachine. NFC. llvm-svn: 283002
* Stop calling getTargetTriple off of the AsmPrinter and constructing aEric Christopher2016-10-011-2/+2
| | | | | | TargetTriple, just grab it off of the TargetMachine. NFC. llvm-svn: 283001
* ScheduleDAGInstrs: Cleanup, use range based for; NFCMatthias Braun2016-09-301-61/+45
| | | | llvm-svn: 282979
* [SEH] Emit the parent frame offset label even if there are no funcletsReid Kleckner2016-09-301-19/+39
| | | | | | | | | This avoids errors about references to undefined local labels from unreferenced filter functions. Fixes (sort of) PR30431 llvm-svn: 282967
* Revert "[RegAllocGreedy] Attempt to split unspillable live intervals"Dylan McKay2016-09-301-8/+6
| | | | | | It was accidentally committed. llvm-svn: 282855
* [RegAllocGreedy] Attempt to split unspillable live intervalsDylan McKay2016-09-301-6/+8
| | | | | | | | | | | | | | | | | | | | | | | Summary: Previously, when allocating unspillable live ranges, we would never attempt to split. We would always bail out and try last ditch graph recoloring. This patch changes this by attempting to split all live intervals before performing recoloring. This fixes LLVM bug PR14879. I can't add test cases for any backends other than AVR because none of them have small enough register classes to trigger the bug. Reviewers: qcolombet Subscribers: MatzeB Differential Revision: https://reviews.llvm.org/D25070 llvm-svn: 282852
* Clamp version number in S_COMPILE3 to avoid overflowing 16-bit field.Adrian McCarthy2016-09-291-5/+6
| | | | llvm-svn: 282761
* [RegisterBankInfo] Change the default mapping for Copy and PHI.Quentin Colombet2016-09-292-54/+37
| | | | | | | | | | | | | | Instead of producing a mapping for all the operands, we only generate a mapping for the definition. Indeed, the other operands are not constrained by the instruction and thus, we should leave the choice to the actual definition to do the right thing. In pratice this is almost NFC, but with one advantage. We will have only one instance of OperandsMapping for each copy and phi that map to one register bank instead of one different instance for each different number of operands for each copy and phi. llvm-svn: 282756
* [codeview] Use character types for all byte-sized integer typesReid Kleckner2016-09-291-10/+10
| | | | | | | | | | | | | | | The VS debugger doesn't appear to understand the 0x68 or 0x69 type indices, which were probably intended for use on a platform where a C 'int' is 8 bits. So, use the character types instead. Clang was already using the character types because '[u]int8_t' is usually defined in terms of 'char'. See the Rust issue for screenshots of what VS does: https://github.com/rust-lang/rust/issues/36646 Fixes PR30552 llvm-svn: 282739
* MachineFunction: Add missing newline in debug print()Matthias Braun2016-09-291-0/+1
| | | | | | Should not be a functional but an aesthetic change. llvm-svn: 282669
* [RegisterBankInfo] Uniquely generate OperandsMapping.Quentin Colombet2016-09-281-8/+63
| | | | | | | | | | | This is a step toward statically allocate InstructionMapping. Like the previous few commits, the goal is to move toward a TableGen'ed like structure with no dynamic allocation at all. This should already improve compile time by getting rid of a bunch of memmove of SmallVectors. llvm-svn: 282643
* [RegisterBankInfo] Rework the APIs of ValueMapping.Quentin Colombet2016-09-281-10/+12
| | | | | | | This is a preparatory commit for more TableGen-like structure. NFC llvm-svn: 282642
* Remove dead code from LiveDebugVariables.cpp (NFC)Adrian Prantl2016-09-281-44/+10
| | | | | | | | LiveDebugVariables doesn't propagate DBG_VALUEs accross basic block boundaries any more; this functionality was split into LiveDebugValues. We can thus drop the now dead references to LexicalScopes from LiveDebugVariables. llvm-svn: 282638
* IfConversion: Add implicit uses for redefined regs with live subregistersKrzysztof Parzyszek2016-09-281-0/+11
| | | | | | | | | | Normally, if conversion would add implicit uses for redefined registers, e.g. R0<def> = add_if ..., R0<imp-use>. However, if only subregisters of R0 are known to be live but not R0 itself, such implicit uses will not be added, causing prior definitions of such subregisters and R0 itself to become dead. llvm-svn: 282626
* Teach LiveDebugValues about lexical scopes.Adrian Prantl2016-09-281-8/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This addresses PR26055 LiveDebugValues is very slow. Contrary to the old LiveDebugVariables pass LiveDebugValues currently doesn't look at the lexical scopes before inserting a DBG_VALUE intrinsic. This means that we often propagate DBG_VALUEs much further down than necessary. This is especially noticeable in large C++ functions with many inlined method calls that all use the same "this"-pointer. For example, in the following code it makes no sense to propagate the inlined variable a from the first inlined call to f() into any of the subsequent basic blocks, because the variable will always be out of scope: void sink(int a); void __attribute((always_inline)) f(int a) { sink(a); } void foo(int i) { f(i); if (i) f(i); f(i); } This patch reuses the LexicalScopes infrastructure we have for LiveDebugVariables to take this into account. The effect on compile time and memory consumption is quite noticeable: I tested a benchmark that is a large C++ source with an enormous amount of inlined "this"-pointers that would previously eat >24GiB (most of them for DBG_VALUE intrinsics) and whose compile time was dominated by LiveDebugValues. With this patch applied the memory consumption is 1GiB and 1.7% of the time is spent in LiveDebugValues. https://reviews.llvm.org/D24994 Thanks to Daniel Berlin and Keith Walker for reviewing! llvm-svn: 282611
* Rewrite loops to use range-based for. (NFC)Adrian Prantl2016-09-281-17/+5
| | | | llvm-svn: 282608
* Revert "In visitSTORE, always use FindBetterChain, rather than only when ↵Nirav Dave2016-09-282-121/+272
| | | | | | | | UseAA is enabled." This reverts commit r282600 due to test failues with MCJIT llvm-svn: 282604
* In visitSTORE, always use FindBetterChain, rather than only when UseAA is ↵Nirav Dave2016-09-282-272/+121
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | enabled. Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. Whem merging stores, search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates *worse* code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores *CAN* be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll - This test appears to work but no longer exhibits the spill behavior. Reviewers: arsenm, hfinkel, tstellarAMD, nhaehnle, jyknight Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 282600
* [DAG] Remove isVectorClearMaskLegal() check from vector_build dagcombineMichael Kuperstein2016-09-281-7/+0
| | | | | | | | | | | | This check currently doesn't seem to do anything useful on any in-tree target: On non-x86, it always evaluates to false, so we never hit the code path that creates the shuffle with zero. On x86, it just forwards to isShuffleMaskLegal(), which is a reasonable thing to query in general, but doesn't make sense if only restricted to zero blends. Differential Revision: https://reviews.llvm.org/D24625 llvm-svn: 282567
* [TargetRegisterInfo, AArch64] Add target hook for isConstantPhysReg().Geoff Berry2016-09-271-1/+5
| | | | | | | | | | | | | | | | | | | Summary: The current implementation of isConstantPhysReg() checks for defs of physical registers to determine if they are constant. Some architectures (e.g. AArch64 XZR/WZR) have registers that are constant and may be used as destinations to indicate the generated value is discarded, preventing isConstantPhysReg() from returning true. This change adds a TargetRegisterInfo hook that overrides the no defs check for cases such as this. Reviewers: MatzeB, qcolombet, t.p.northover, jmolloy Subscribers: junbuml, aemerson, mcrosier, rengolin Differential Revision: https://reviews.llvm.org/D24570 llvm-svn: 282543
* Propagate DBG_VALUE entries when there are unvisited predecessorsKeith Walker2016-09-271-10/+24
| | | | | | | | | | | | | | | | | Variables are sometimes missing their debug location information in blocks in which the variables should be available. This would occur when one or more predecessor blocks had not yet been visited by the routine which propagated the information from predecessor blocks. This is addressed by only considering predecessor blocks which have already been visited. The solution to this problem was suggested by Daniel Berlin on the LLVM developer mailing list. Differential Revision: https://reviews.llvm.org/D24927 llvm-svn: 282506
* Add support to optionally limit the size of jump tables.Evandro Menezes2016-09-262-12/+38
| | | | | | | | | | | | | | | | | | | Many high-performance processors have a dedicated branch predictor for indirect branches, commonly used with jump tables. As sophisticated as such branch predictors are, they tend to have well defined limits beyond which their effectiveness is hampered or even nullified. One such limit is the number of possible destinations for a given indirect branches that such branch predictors can handle. This patch considers a limit that a target may set to the number of destination addresses in a jump table. Patch by: Evandro Menezes <e.menezes@samsung.com>, Aditya Kumar <aditya.k7@samsung.com>, Sebastian Pop <s.pop@samsung.com>. Differential revision: https://reviews.llvm.org/D21940 llvm-svn: 282412
* [ARM] Promote small global constants to constant poolsJames Molloy2016-09-261-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a constant is unamed_addr and is only used within one function, we can save on the code size and runtime cost of an indirection by changing the global's storage to inside the constant pool. For example, instead of: ldr r0, .CPI0 bl printf bx lr .CPI0: &format_string format_string: .asciz "hello, world!\n" We can emit: adr r0, .CPI0 bl printf bx lr .CPI0: .asciz "hello, world!\n" This can cause significant code size savings when many small strings are used in one function (4 bytes per string). This recommit contains fixes for a nasty bug related to fast-isel fallback - because fast-isel doesn't know about this optimization, if it runs and emits references to a string that we inline (because fast-isel fell back to SDAG) we will end up with an inlined string and also an out-of-line string, and we won't emit the out-of-line string, causing backend failures. It also contains fixes for emitting .text relocations which made the sanitizer bots unhappy. llvm-svn: 282387
* [X86][avx512] Fix bug in masked compress store.Ayman Musa2016-09-261-3/+3
| | | | | | Differential Revision: https://reviews.llvm.org/D23984 llvm-svn: 282381
* [RegisterBankInfo] Constify the member of the XXXMapping maps.Quentin Colombet2016-09-241-2/+2
| | | | | | | This makes it obvious that items in those maps behave like statically created objects. llvm-svn: 282327
* [RegisterBankInfo] Add statistics for dynamic value mappings.Quentin Colombet2016-09-241-0/+8
| | | | | | | Like partial mappings, as we move toward TableGen'ed information, the number should reach zero eventually. llvm-svn: 282325
* [RegisterBankInfo] Uniquely generate ValueMapping.Quentin Colombet2016-09-241-11/+52
| | | | | | | | This is a step toward statically allocate ValueMapping. Like the previous few commits, the goal is to move toward a TableGen'ed like structure with no dynamic allocation at all. llvm-svn: 282324
* [RegisterBankInfo] Keep valid pointers for PartialMappings.Quentin Colombet2016-09-241-4/+9
| | | | | | | | | | | | | | | Previously we were using the address of the unique instance of a partial mapping in the related map to access this instance. However, when the map grows, the whole set of instances may be moved elsewhere and the previous addresses are not valid anymore. Instead, keep the address of the unique heap allocated instance of a partial mapping. Note: I did not see any actual bugs for that problem as the number of partial mappings dynamically allocated is small (<= 4). llvm-svn: 282323
OpenPOWER on IntegriCloud