path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* [PPC64LE] Teach swap optimization about the doubleword splat idiomBill Schmidt2015-07-021-12/+22
| With a previous patch, the VSX swap optimization is able to recognize the doubleword load-splat idiom that can be implemented using lxvdsx. However, that does not cover a doubleword splat where the source is a register. We can implement this using xxspltd (a special form of xxpermdi). This patch teaches the swap optimization pass about this idiom.
|
| As a prerequisite, it also permits swap optimization to succeed for all forms of SUBREG_TO_REG. Previously we were conservative and only allowed SUBREG_TO_REG when it copied a full register. However, on reflection any form of SUBREG_TO_REG is safe in and of itself, so long as an unsafe operation is not performed on its result. In particular, a widening SUBREG_TO_REG often occurs as an input to a doubleword splat idiom, particularly in auto-vectorized code.
|
| The doubleword splat idiom is an XXPERMDI operation where both source registers are identical, and the selection mask is either 0 (splat the first element) or 3 (splat the second element). To determine whether the registers are identical, we use the existing mechanism for looking through "copy-like" operations. That mechanism has a side effect of marking the XXPERMDI operation as using a physical register, which would invalidate its presence in a swap-optimized region. This is correct for the form of XXPERMDI that performs a swap and hence would be removed, but is not what we want for a doubleword-splat variety of XXPERMDI. Therefore we reset the physical-register flag on the XXPERMDI when it represents a splat.
|
| A simple test case is added to verify that we generate the splat and that we also remove the xxswapd instructions that would otherwise be associated with the load and store of another operand.
|
| llvm-svn: 241285
* Convert a member variable to a local one.Rafael Espindola2015-07-021-3/+3
| | | | llvm-svn: 241284
* Fix for PR23310: llvm-dis crashes when trying to upgrade an intrinsic.Rafael Espindola2015-07-021-3/+3
| When trying to upgrade @llvm.x86.sse2.psrl.dq while parsing a module, BitcodeReader adds the function to its worklist twice, resulting in a crash when accessing it the second time.
|
| This patch replaces the worklist vector by a map.
|
| Patch by Philip Pfaffe.
|
| llvm-svn: 241281
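A minimal sketch of why a map-keyed worklist sidesteps the double entry described in this message. This is not BitcodeReader's code: `UpgradeWorklist`, `Pending`, and `enqueue` are hypothetical names, and functions are identified by plain strings for illustration only.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for an upgrade worklist:
// old intrinsic name -> replacement name.
struct UpgradeWorklist {
  std::map<std::string, std::string> Pending;

  // Returns true if the entry was newly added. Enqueueing the same key
  // a second time leaves the map unchanged, so the entry cannot be
  // processed (and invalidated) twice.
  bool enqueue(const std::string &Old, const std::string &New) {
    return Pending.emplace(Old, New).second;
  }
};
```

With a vector, the second `push_back` of the same function would have produced a second entry; with a map the duplicate is absorbed at insertion time.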
* Rangify some loops.Rafael Espindola2015-07-021-17/+13
| | | | | | Patch by Philip Pfaffe! llvm-svn: 241279
* [Support] Lazy load of dbghlp.dll on WindowsLeny Kholodov2015-07-023-41/+54
| This patch changes linkage with dbghlp.dll for clang from static (at load time) to on demand (at the first use of required functions). Clang uses dbghlp.dll only in minor use cases, chiefly on a crash and when loading a plugin. The dbghlp.dll library can be absent from a system, in which case clang fails to load. With lazy load of dbghlp.dll clang can work even if dbghlp.dll is not available.
|
| Differential Revision: http://reviews.llvm.org/D10737
|
| llvm-svn: 241271
* Remove whitespace from start of line, NFC.Yaron Keren2015-07-021-2/+2
| | | | llvm-svn: 241268
* Delete whitespace at start of line.Yaron Keren2015-07-021-1/+1
| | | | llvm-svn: 241265
* Reapply r240291: Fix shl folding in DAG combiner.Pawel Bylica2015-07-021-1/+1
| | | | | | | | The code responsible for shl folding in the DAGCombiner was assuming incorrectly that all constants are less than 64 bits. This patch simply changes the way values are compared. It has been reverted previously because of some problems with comparing APInt with raw uint64_t. That has been fixed/changed with r241204. llvm-svn: 241254
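The general hazard this message describes, comparing a wide constant after squeezing it into 64 bits, can be illustrated without LLVM. The sketch below is an assumption-laden illustration, not the DAGCombiner code: `Wide128` is a hypothetical two-word stand-in for an arbitrary-precision constant, and both function names are invented.

```cpp
#include <cstdint>

// Hypothetical 128-bit unsigned constant, split into two 64-bit words.
struct Wide128 {
  uint64_t Hi;
  uint64_t Lo;
};

// Hazardous pattern: only the low 64 bits take part in the comparison,
// so a huge value can look small.
bool lessThanTruncating(const Wide128 &C, uint64_t Limit) {
  return C.Lo < Limit;
}

// Full-width comparison: any set bit in the high word means the value
// cannot be below a 64-bit limit.
bool lessThanFullWidth(const Wide128 &C, uint64_t Limit) {
  return C.Hi == 0 && C.Lo < Limit;
}
```

For a value of 2^64 + 3, the truncating check reports it as smaller than 128, while the full-width check correctly rejects it.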
* [GraphWriter] Don't wait on xdg-open when not on Apple.Charlie Turner2015-07-021-1/+1
| By default, the GraphWriter code assumes that the generic file open program (`open` on Apple, `xdg-open` on other systems) can wait on the forked process to complete. When the fork ends, the code would delete the temporary dot files created, and return.
|
| On GNU/Linux, the xdg-open program does not have a "wait for your fork to complete before dying" option. So the behaviour was that xdg-open would launch a process, quickly die itself, and then the GraphWriter code would think it's OK to quickly delete all the temporary files. Once the temporary files were deleted, the dot viewers would get very upset, and often give you weird errors.
|
| This change only waits on the generic open program on Apple platforms. Elsewhere, we don't wait on the process, and hence we don't try and clean up the temporary files.
|
| llvm-svn: 241250
* [NFC] Make the Statepoint class more like CallSiteSanjoy Das2015-07-021-3/+3
| | | | | | | | | | Summary: Rename some methods to make Statepoint look more like CallSite. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10756 llvm-svn: 241235
* Implement TargetTransformInfo::hasCompatibleFunctionAttributes for X86.Eric Christopher2015-07-022-0/+17
| This checks subtarget feature compatibility for inlining by verifying that the callee is a strict subset of the caller's features. This includes the cpu as part of the subtarget we can get via the incoming functions as the backend takes CPUs as feature sets.
|
| This allows us to inline things like:
|
| int foo() { return baz(); }
|
| int __attribute__((target("sse4.2"))) bar() {
|   return foo();
| }
|
| so that generic code can be inlined into specialized functions.
|
| llvm-svn: 241221
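The compatibility rule in this message can be sketched as a bit-set containment test. This is a hedged illustration only: `NumFeatures`, `FeatureSet`, and `calleeFeaturesCompatible` are hypothetical names, the feature count is arbitrary, and the sketch assumes that identical feature sets also count as compatible (plain subset rather than strict subset).

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical feature universe; each bit stands for one subtarget feature.
constexpr std::size_t NumFeatures = 8;
using FeatureSet = std::bitset<NumFeatures>;

// The callee may be inlined when every feature it relies on is also
// enabled in the caller, i.e. the callee's bits are contained in the
// caller's bits.
bool calleeFeaturesCompatible(const FeatureSet &Caller,
                              const FeatureSet &Callee) {
  return (Caller & Callee) == Callee;
}
```

Under this rule a generic callee (few bits set) fits into a specialized caller, but a specialized callee does not fit into a generic caller, which matches the `foo`/`bar` example in the message.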
* Add a routine to TargetTransformInfo that will allow targets to lookEric Christopher2015-07-022-4/+10
| | | | | | | at the attributes on a function to determine whether or not to allow inlining. llvm-svn: 241220
* WebAssembly: start instructionsJF Bastien2015-07-018-13/+31
| Summary:
| * Add 64-bit address space feature.
| * Rename SIMD feature to SIMD128.
| * Handle single-thread model with an IR pass (same way ARM does).
| * Rename generic processor to MVP, to follow design's lead.
| * Add bleeding-edge processors, with all features included.
| * Fix a few DEBUG_TYPE to match other backends.
|
| Test Plan: ninja check
|
| Reviewers: sunfish
|
| Subscribers: jfb, llvm-commits
|
| Differential Revision: http://reviews.llvm.org/D10880
|
| llvm-svn: 241211
* [TwoAddressInstructionPass] Try 3 Addr Conversion After Commuting.Quentin Colombet2015-07-011-2/+18
| TwoAddressInstructionPass stops after a successful commuting but 3 Addr conversion might be good for some cases.
|
| Consider:
|
| int foo(int a, int b) {
|   return a + b;
| }
|
| Before this commit, we emit:
|
|   addl %esi, %edi
|   movl %edi, %eax
|   ret
|
| After this commit, we try 3 Addr conversion:
|
|   leal (%rsi,%rdi), %eax
|   ret
|
| Patch by Volkan Keles <vkeles@apple.com>!
|
| Differential Revision: http://reviews.llvm.org/D10851
|
| llvm-svn: 241206
* [LoopVectorize] Use ReplaceInstWithInst() helper where appropriate.Alexey Samsonov2015-07-011-22/+15
| This is mostly an NFC, which increases code readability (instead of saving old terminator, generating new one in front of old, and deleting old, we just call a function). However, it would additionally copy the debug location from old instruction to replacement, which would help PR23837.
|
| llvm-svn: 241197
* Pack MCSymbol::Flags in to the bitfield with other members. NFC.Pete Cooper2015-07-011-0/+1
| All file formats only needed 16-bits right now which is enough to fit in to the padding with other fields.
|
| This reduces the size of MCSymbol to 24-bytes on a 64-bit system. The layout is now
|
|  0 | class llvm::MCSymbol
|  0 |   class llvm::PointerIntPair SectionOrFragmentAndHasName
|  0 |     intptr_t Value
|    |   [sizeof=8, dsize=8, align=8
|    |    nvsize=8, nvalign=8]
|  8 |   unsigned int IsTemporary
|  8 |   unsigned int IsRedefinable
|  8 |   unsigned int IsUsed
|  8 |   _Bool IsRegistered
|  8 |   unsigned int IsExternal
|  8 |   unsigned int IsPrivateExtern
|  8 |   unsigned int Kind
|  9 |   unsigned int IsUsedInReloc
|  9 |   unsigned int SymbolContents
|  9 |   unsigned int CommonAlignLog2
| 10 |   uint32_t Flags
| 12 |   uint32_t Index
| 16 |   union
| 16 |     uint64_t Offset
| 16 |     uint64_t CommonSize
| 16 |     const class llvm::MCExpr * Value
|    |   [sizeof=8, dsize=8, align=8
|    |    nvsize=8, nvalign=8]
|    | [sizeof=24, dsize=24, align=8
|    |  nvsize=24, nvalign=8]
|
| llvm-svn: 241196
* [WebAssembly] Define separate Target instances for 32-bit and 64-bit.Dan Gohman2015-07-014-6/+9
| | | | llvm-svn: 241193
* [NVPTX] expand extload/truncstore for vectors of floatsJingyue Wu2015-07-011-0/+7
| Summary:
| According to PTX ISA:
|
| For convenience, ld, st, and cvt instructions permit source and destination data operands to be wider than the instruction-type size, so that narrow values may be loaded, stored, and converted using regular-width registers. For example, 8-bit or 16-bit values may be held directly in 32-bit or 64-bit registers when being loaded, stored, or converted to other types and sizes. The operand type checking rules are relaxed for bit-size and integer (signed and unsigned) instruction types; floating-point instruction types still require that the operand type-size matches exactly, unless the operand is of bit-size type.
|
| So, the ISA does not support load with extending/store with truncation for floating numbers. This is reflected in setting the loadext/truncstore actions to expand in the code for floating numbers, but vectors of floating numbers are not taken care of.
|
| As a result, loading a vector of floats followed by a fp_extend may be combined by DAGCombiner to an extload, and the extload may be lowered to NVPTXISD::LoadV2 with extending information. However, NVPTXISD::LoadV2 does not perform extending, and no extending instructions are inserted. Finally, PTX instructions with mismatched types are generated, like
|
| ld.v2.f32 {%fd3, %fd4}, [%rd2]
|
| This patch adds the correct actions for vectors of floats, so DAGCombiner would not create loads with extending, and correct code is generated.
|
| Patched by Gang Hu.
|
| Test Plan: Test case attached.
|
| Reviewers: jingyue
|
| Reviewed By: jingyue
|
| Subscribers: llvm-commits, jholewinski
|
| Differential Revision: http://reviews.llvm.org/D10876
|
| llvm-svn: 241191
* Encode MCSymbol alignment as log2(align).Pete Cooper2015-07-011-0/+2
| | | | | | | | | | | | | Given that alignments are always powers of 2, just encode it this way. This matches how we encode alignment on IR GlobalValue's for example. This compresses the CommonAlign member down to 5 bits which allows it to pack better with the surrounding fields. Reviewed by Duncan Exon Smith. llvm-svn: 241189
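A minimal sketch of the log2 encoding this message describes, assuming 32-bit alignments. The helper names `encodeAlignLog2` and `decodeAlignLog2` are hypothetical and are not MCSymbol's accessors.

```cpp
#include <cassert>
#include <cstdint>

// Encode a power-of-two alignment as its base-2 logarithm. For 32-bit
// alignments the result lies in 0..31 and therefore fits in 5 bits.
unsigned encodeAlignLog2(uint32_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 &&
         "alignment must be a power of two");
  unsigned Log2 = 0;
  while ((Align >>= 1) != 0)
    ++Log2;
  return Log2;
}

// Recover the original alignment from its 5-bit encoding.
uint32_t decodeAlignLog2(unsigned Log2) {
  assert(Log2 < 32 && "encoded value must fit in 5 bits");
  return uint32_t(1) << Log2;
}
```

Because only powers of two are legal, the encoding loses no information: an alignment of 4096 is stored as 12 and decodes back to 4096.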
* [WinEH] Use llvm.x86.seh.recoverfp in WinEHPrepareReid Kleckner2015-07-011-40/+48
| | | | | | | | | | | Don't pattern match for frontend outlined finally calls on non-x64 platforms. The 32-bit runtime uses a different funclet prototype. Now, the frontend is pre-outlining the finally bodies so that it ends up doing most of the heavy lifting for variable capturing. We're just outlining the callsite, and adapting the frameaddress(0) call to line up the frame pointer recovery. llvm-svn: 241186
* [NVPTX] Move NVPTXPeephole after NVPTXPrologEpilogPassJingyue Wu2015-07-012-11/+16
| Summary:
| Offset of frame index is calculated by NVPTXPrologEpilogPass. Before that the correct offset of stack objects cannot be obtained, which leads to wrong offset if there are more than 2 frame objects. This patch moves NVPTXPeephole after NVPTXPrologEpilogPass. Because the frame index is already replaced by %VRFrame in NVPTXPrologEpilogPass, we check VRFrame register instead, and try to remove the VRFrame if there is no usage after NVPTXPeephole pass.
|
| Patched by Xuetian Weng.
|
| Test Plan:
| Strengthened test/CodeGen/NVPTX/local-stack-frame.ll to check the offset calculation based on SP and SPL.
|
| Reviewers: jholewinski, jingyue
|
| Reviewed By: jingyue
|
| Subscribers: jholewinski, llvm-commits
|
| Differential Revision: http://reviews.llvm.org/D10853
|
| llvm-svn: 241185
* [PPC64LE] Enable missing lxvdsx optimization, and related swap optimizationBill Schmidt2015-07-012-13/+11
| | | | | | | | | | | | | | | | | | | | | | | When adding little-endian vector support for PowerPC last year, I inadvertently disabled an optimization that recognizes a load-splat idiom and generates the lxvdsx instruction. This patch moves the offending logic so lxvdsx is once again generated. This pattern is frequently generated by the vectorizer for scalar loads of an effective constant. Previously the lxvdsx instruction was wrongly listed as lane-sensitive for the VSX swap optimization (since both doublewords are identical, swaps are safe). This patch fixes this as well, so that vectorized code using lxvdsx can now have swaps removed from the computation. There is an existing test (@test50) in test/CodeGen/PowerPC/vsx.ll that checks for the missing optimization. However, vsx.ll was only being tested for POWER7 with big-endian code generation. I've added a little-endian RUN statement and expected LE code generation for all the tests in vsx.ll to give us a bit better VSX coverage, including what's needed for this patch. llvm-svn: 241183
* add a cl::opt override for TargetLoweringBase's JumpIsExpensiveSanjay Patel2015-07-011-1/+12
| | | | | | | | | | | | | This patch is not intended to change existing codegen behavior for any target. It just exposes the JumpIsExpensive setting on the command-line to allow for easier testing and emergency overrides. Also, change the existing regression test to use FileCheck, explicitly specify the jump-is-expensive option, and use more precise checks. Differential Revision: http://reviews.llvm.org/D10846 llvm-svn: 241179
* Revert "[DWARF] Fix debug info generation for function static variables, typedefs, and records"David Blaikie2015-07-016-62/+35
| Caused PR24008
|
| This reverts commit 37cb5f1c2db9f42d29f26b215585f56bb64ae4f5.
|
| llvm-svn: 241176
* fix formatting; NFCSanjay Patel2015-07-011-2/+2
| | | | llvm-svn: 241175
* fix typos in comment; NFCSanjay Patel2015-07-011-3/+2
| | | | llvm-svn: 241174
* LivePhysRegs: Add support to add pristine registers when populating with live-in/live-out registers.Matthias Braun2015-07-011-0/+41
| Differential Revision: http://reviews.llvm.org/D10139
|
| llvm-svn: 241172
* [SEH] Don't assert if the parent function lacks a personalityReid Kleckner2015-07-011-0/+6
| | | | | | | | The EH code might have been deleted as unreachable and the personality pruned while the filter is still present. Currently I'm hitting this at -O0 due to the clang bug PR24009. llvm-svn: 241170
* [AsmPrinter] Hide implementation detailsBenjamin Kramer2015-07-014-6/+6
| | | | | | NFC. llvm-svn: 241169
* [AArch64] Implement add/adds/sub/subs/cmp/cmn with negative immediate aliasesArnaud A. de Grandmaison2015-07-013-10/+78
| This patch teaches the AsmParser to accept add/adds/sub/subs/cmp/cmn with a negative immediate operand and convert them as shown:
|
|   add  Rd, Rn, -imm -> sub  Rd, Rn, imm
|   sub  Rd, Rn, -imm -> add  Rd, Rn, imm
|   adds Rd, Rn, -imm -> subs Rd, Rn, imm
|   subs Rd, Rn, -imm -> adds Rd, Rn, imm
|   cmp  Rn, -imm     -> cmn  Rn, imm
|   cmn  Rn, -imm     -> cmp  Rn, imm
|
| Those instructions are an alternate syntax available to assembly coders, and are needed in order to support code already compiling with some other assemblers (gas). They are documented in the "ARMv8 Instruction Set Overview", in the "Arithmetic (immediate)" section. This makes llvm-mc a programmer-friendly assembler!
|
| This also fixes PR20978: "Assembly handling of adding negative numbers not as smart as gas".
|
| llvm-svn: 241166
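The conversion table in this message can be sketched as a small rewrite over a mnemonic and an immediate. This is an illustration under stated assumptions, not the AsmParser implementation: `ArithImm` and `canonicalizeNegativeImm` are hypothetical names, registers are omitted, and the sketch does not check whether the negated immediate is encodable or guard against negating the minimum 64-bit value.

```cpp
#include <cstdint>
#include <string>

// Hypothetical, simplified view of an "arithmetic (immediate)" instruction.
struct ArithImm {
  std::string Mnemonic;
  int64_t Imm;
};

// If the immediate is negative, swap the mnemonic for its counterpart
// and negate the immediate; otherwise return the instruction unchanged.
ArithImm canonicalizeNegativeImm(ArithImm I) {
  if (I.Imm >= 0)
    return I;
  static const char *const Pairs[][2] = {
      {"add", "sub"}, {"adds", "subs"}, {"cmp", "cmn"}};
  for (const auto &P : Pairs) {
    if (I.Mnemonic == P[0])
      return {P[1], -I.Imm};
    if (I.Mnemonic == P[1])
      return {P[0], -I.Imm};
  }
  return I; // Not one of the six mnemonics: leave untouched.
}
```

Storing the six mnemonics as three symmetric pairs keeps the table in one place, since each row of the message's table is the mirror image of another row.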
* [SDAG] Give InstrEmitter hidden visibilityBenjamin Kramer2015-07-011-1/+1
| | | | | | NFC. llvm-svn: 241165
* [CodeGen] Reduce visibility of implementation detailsBenjamin Kramer2015-07-018-11/+11
| | | | | | NFC. llvm-svn: 241164
* [Sparc] Rearrange SparcInstrInfo, no change.James Y Knight2015-07-011-68/+80
| | | | | | | | | Move some instructions into order of sections in the spec, as the rest already were. Differential Revision: http://reviews.llvm.org/D9102 llvm-svn: 241163
* AVX-512: Implemented missing encoding for FMA scalar instructionsIgor Breger2015-07-011-34/+95
| | | | | | | | Added tests for encoding Differential Revision: http://reviews.llvm.org/D10865 llvm-svn: 241159
* Return ErrorOr from getSection.Rafael Espindola2015-07-011-0/+2
| This also improves the logic of what is an error:
|
| * getSection(uint_32): only return an error if the index is out of bounds. The index 0 corresponds to a perfectly valid entry.
| * getSection(Elf_Sym): Returns null for symbols that normally don't have sections and error for out of bound indexes.
|
| In many places this just moves the report_fatal_error up the stack, but those can then be fixed in smaller patches.
|
| llvm-svn: 241156
* [DWARF] Fix debug info generation for function static variables, typedefs, and recordsMichael Kuperstein2015-07-016-35/+62
| Function static variables, typedefs and records (class, struct or union) declared inside a lexical scope were associated with the function as their parent scope, rather than the lexical scope they are defined or declared in.
|
| This fixes PR19238
|
| Patch by: amjad.aboud@intel.com
|
| Differential Revision: http://reviews.llvm.org/D9758
|
| llvm-svn: 241153
* [X86] Avoid over-relaxation of 8-bit immediates in integer arithmetic instructions.Michael Kuperstein2015-07-011-24/+6
| Only consider an instruction a candidate for relaxation if the last operand of the instruction is an expression. We previously checked whether any operand is an expression, which is useless, since for all instructions concerned, the only operand that may be affected by relaxation is the last one. In addition, this removes the check for having RIP as an argument, since it was plain wrong - even when one of the arguments is RIP, relaxation may still be needed.
|
| This fixes PR9807.
|
| Patch by: david.l.kreitzer@intel.com
|
| Differential Revision: http://reviews.llvm.org/D10766
|
| llvm-svn: 241152
* [mips][microMIPS] Implement SLL and NOP instructionsZoran Jovanovic2015-07-013-0/+21
| | | | | | http://reviews.llvm.org/D10474 llvm-svn: 241150
* Fix PR23872: Integrated assembler error message when using .type directive with @ in AArch32 assembly.Gabor Ballabas2015-07-011-4/+10
| The AArch32 assembler parses the '@' as a comment symbol, so the error message shouldn't suggest that '@<type>' is a valid replacement when assembling for AArch32 target.
|
| Differential Revision: http://reviews.llvm.org/D10651
|
| llvm-svn: 241149
* [LoopUnroll] Use undef for phis with no value liveDavid Majnemer2015-07-011-1/+1
| | | | | | | | We would create a phi node with a zero initialized operand instead of undef in the case where no value was originally available. This was problematic for x86_mmx which has no null value. llvm-svn: 241143
* [SCCP] Turn loads of null into undef instead of zero initialized valuesDavid Majnemer2015-07-011-1/+1
| | | | | | | | | | Surprisingly, this is a correctness issue: the mmx type exists for calling convention purposes, LLVM doesn't have a zero representation for them. This partially fixes PR23999. llvm-svn: 241142
* [NaryReassociate] enhances nsw by leveraging @llvm.assumeJingyue Wu2015-07-011-14/+55
| Summary:
| nsw are flaky and can often be removed by optimizations. This patch enhances nsw by leveraging @llvm.assume in the IR. Specifically, NaryReassociate now understands that
|
|   assume(a + b >= 0) && assume(a >= 0) ==> a +nsw b
|
| As a result, it can split more sext(a + b) into sext(a) + sext(b) for CSE.
|
| Test Plan: nary-gep.ll
|
| Reviewers: broune, meheff
|
| Subscribers: jholewinski, llvm-commits
|
| Differential Revision: http://reviews.llvm.org/D10822
|
| llvm-svn: 241139
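The implication stated in this message can be checked exhaustively for a small bit width. The sketch below is not NaryReassociate's code: it brute-forces all 8-bit signed pairs, and the names `wrapToInt8` and `countNswViolations` are hypothetical. It counts pairs where the assumptions hold yet the wrapped sum differs from the exact sum (that is, where signed overflow occurred).

```cpp
// Wrap an exact integer into the 8-bit signed range [-128, 127] using
// two's-complement (modulo 256) arithmetic.
int wrapToInt8(int Exact) {
  return ((Exact + 128) % 256 + 256) % 256 - 128;
}

// Count pairs (A, B) of 8-bit signed values for which the assumptions
// hold but A + B overflows. With RequireANonNegative set, the
// assumptions are "wrapped sum >= 0" and "A >= 0"; without it, only
// "wrapped sum >= 0" is assumed.
int countNswViolations(bool RequireANonNegative) {
  int Violations = 0;
  for (int A = -128; A <= 127; ++A) {
    for (int B = -128; B <= 127; ++B) {
      int Exact = A + B;
      int Wrapped = wrapToInt8(Exact);
      bool AssumptionsHold =
          Wrapped >= 0 && (!RequireANonNegative || A >= 0);
      bool Overflowed = Exact != Wrapped;
      if (AssumptionsHold && Overflowed)
        ++Violations;
    }
  }
  return Violations;
}
```

With both assumptions the count is zero: if `A >= 0` and `B >= 0`, an overflow would wrap to a negative sum, and if `B < 0` the sum cannot overflow at all. Dropping the `A >= 0` assumption admits counterexamples such as `A = B = -128`, whose wrapped sum is 0.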
* [SanitizerCoverage] Don't add instrumentation to unreachable blocks.Alexey Samsonov2015-06-301-0/+7
| | | | llvm-svn: 241127
* [SEH] Add new intrinsics for recovering and restoring parent framesReid Kleckner2015-06-304-48/+114
| The incoming EBP value established by the runtime is actually a pointer to the end of the EH registration object, and not the true parent function frame pointer. Clang doesn't need llvm.x86.seh.exceptioninfo anymore because we know that the exception info pointer is at a fixed offset from this incoming EBP.
|
| The llvm.x86.seh.recoverfp intrinsic takes an EBP value provided by the EH runtime and returns a pointer that is usable with llvm.framerecover.
|
| The llvm.x86.seh.restoreframe intrinsic is inserted by the 32-bit specific preparation pass in blocks targeted by the EH runtime. It re-establishes any physical registers used by the parent function to address the stack, such as the frame, base, and stack pointers.
|
| Neither of these intrinsics correctly handles stack realignment prologues yet, but it's possible to add that later.
|
| Reviewers: majnemer
|
| Differential Revision: http://reviews.llvm.org/D10848
|
| llvm-svn: 241125
* [Cloning] Teach CloneModule about personality functionsDavid Majnemer2015-06-301-0/+4
| | | | | | | | | CloneModule didn't take into account that it needed to remap the value using values in the module. This fixes PR23992. llvm-svn: 241122
* [NVPTX] cleanups and refactoring in NVPTXFrameLowering.cppJingyue Wu2015-06-301-27/+19
| | | | | | | | | | | | | | | | Summary: NFC Test Plan: no regression Reviewers: wengxt Reviewed By: wengxt Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D10849 llvm-svn: 241118
* [FaultMaps] Let the frontend pre-select implicit null check candidates.Sanjoy Das2015-06-301-0/+7
| | | | | | | | | | | | | | | | | | Summary: This change introduces a !make.implicit metadata that allows the frontend to pre-select the set of explicit null checks that will be considered for transformation into implicit null checks. The reason for not using profiling data instead of !make.implicit is explained in the change to `FaultMaps.rst`. Reviewers: atrick, reames, pgavlin, JosephTremoulet Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10824 llvm-svn: 241116
* Pack MCSymbol::HasName in to a spare bit in the section/fragment union.Pete Cooper2015-06-301-1/+1
| This is part of an effort to pack the average MCSymbol down to 24 bytes.
|
| The HasName bit was pushing the size of the bitfield over to another word, so this change uses a PointerIntPair to fit it into unused bits of a PointerUnion.
|
| Reviewed by Rafael Espíndola
|
| llvm-svn: 241115
* Use ErrorOr in getRelocationAdress.Rafael Espindola2015-06-304-13/+9
| | | | | | | We can probably do better in this method, but this is an improvement and enables further ErrorOr cleanups. llvm-svn: 241114
* Implement containsSymbol with other lower level methods.Rafael Espindola2015-06-303-23/+7
| | | | llvm-svn: 241112