path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* [PGO] Ensure vp data in indexed profile always sortedXinliang David Li2016-01-081-0/+2
| | | | | | | | | Done in InstrProfWriter to eliminate the need for client code to do the sorting. The operation is done once and reused many times so it is more efficient. Update unit test to remove sorting. Also update expected output of affected tests. llvm-svn: 257145
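As a rough illustration of sorting value-profile data once at write time, here is a minimal sketch. The `ValueCount` struct and `sortByCount` function are hypothetical stand-ins, not the InstrProfWriter interface, and the descending order is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one value-profile entry: a profiled value and
// the number of times it was observed.
struct ValueCount {
  std::uint64_t Value;
  std::uint64_t Count;
};

// Sort entries so the most frequently seen value comes first (assumed order).
// stable_sort keeps the relative order of entries with equal counts.
void sortByCount(std::vector<ValueCount> &Entries) {
  std::stable_sort(Entries.begin(), Entries.end(),
                   [](const ValueCount &A, const ValueCount &B) {
                     return A.Count > B.Count;
                   });
}
```

Doing this once in the writer means every reader can rely on the order instead of re-sorting.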
* Remove extra whitespace. NFC.Junmo Park2016-01-081-8/+8
| | | | llvm-svn: 257144
* [PGO] Fix a bug in InstProfWriter addRecordXinliang David Li2016-01-082-16/+43
| | | | | | | | | | For a new record with weight != 1, only edge profiling counters are scaled; VP data is not properly scaled. This patch refactors the code and fixes the problem. Also added sort by count interface (for follow up patch). llvm-svn: 257143
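The bug described above can be pictured with a small sketch in which both the edge counters and the value-profile counts are multiplied by the weight. `ProfileRecord`, `ValueCount` and `scaleRecord` are hypothetical names, and overflow handling is deliberately left out.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical value-profile entry.
struct ValueCount {
  std::uint64_t Value;
  std::uint64_t Count;
};

// Hypothetical profile record holding edge counters and value-profile data.
struct ProfileRecord {
  std::vector<std::uint64_t> Counts;
  std::vector<ValueCount> ValueData;
};

// Scale every counter in the record by Weight. Scaling only Counts and
// skipping ValueData would leave the two kinds of data inconsistent.
// Assumption: no overflow handling, to keep the sketch short.
void scaleRecord(ProfileRecord &R, std::uint64_t Weight) {
  for (std::uint64_t &C : R.Counts)
    C *= Weight;
  for (ValueCount &VD : R.ValueData)
    VD.Count *= Weight;
}
```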
* Remove static global GCNames from Function.cpp and move it to the ContextMehdi Amini2016-01-086-38/+35
| | | | | | | | | This removes the need for locking when deleting a function. Differential Revision: http://reviews.llvm.org/D15988 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 257139
* Add call sequence start and end for __tls_get_addrKyle Butt2016-01-081-0/+7
| | | | | | | | | | | | | | This is a fix for bug http://llvm.org/bugs/show_bug.cgi?id=25839. For a PIC TLS variable access in a function, the prologue (mflr followed by std and stdu) gets scheduled after a __tls_get_addr call. __tls_get_addr clobbers LR, but nothing saves or restores it. Also added a test for saving and restoring clobbered registers when calling __tls_get_addr. Patch by Tim Shen llvm-svn: 257137
* [Vectorization] Actually return from error case in isStridedPtrKyle Butt2016-01-081-0/+1
| | | | | | | | | | The early return seems to be missing. This causes a radical and wrong loop optimization on powerpc. It isn't reproducible on x86_64, because "UseInterleaved" is false. Patch by Tim Shen. llvm-svn: 257134
* [InstCombine] insert a new shuffle in a safe place (PR25999)Sanjay Patel2016-01-081-10/+7
| | | | | | | | Limit this transform to a basic block and guard against PHIs. Hopefully, this fixes the remaining failures in PR25999: https://llvm.org/bugs/show_bug.cgi?id=25999 llvm-svn: 257133
* [WebAssembly] Minor code cleanups. NFC.Dan Gohman2016-01-082-5/+3
| | | | llvm-svn: 257131
* IntEqClasses: Let join() return the new leaderMatthias Braun2016-01-081-1/+3
| | | | | | | | The new leader is known anyway so we can return it for some micro optimization in code where it is easy to pass along the result to the next join(). llvm-svn: 257130
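A minimal sketch of the idea, assuming a plain union-find where the smaller index becomes the leader. The `EqClasses` class below is hypothetical and is not the IntEqClasses implementation.

```cpp
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical, simplified equivalence-class structure over 0..N-1.
class EqClasses {
  std::vector<unsigned> Leader; // Leader[i] points toward i's class leader

public:
  explicit EqClasses(unsigned N) : Leader(N) {
    std::iota(Leader.begin(), Leader.end(), 0u); // each element leads itself
  }

  unsigned findLeader(unsigned A) const {
    while (Leader[A] != A)
      A = Leader[A];
    return A;
  }

  // Merge the classes of A and B and return the leader of the merged class.
  // Assumption: the smaller leader index wins.
  unsigned join(unsigned A, unsigned B) {
    unsigned LA = findLeader(A), LB = findLeader(B);
    if (LA == LB)
      return LA;
    if (LA > LB)
      std::swap(LA, LB);
    Leader[LB] = LA;
    return LA;
  }
};
```

Returning the leader lets a caller chain merges, for example `L = EC.join(L, Next)`, without a separate lookup after each call.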
* LiveInterval: A LiveRange is enough for ConnectedVNInfoEqClasses::Classify()Matthias Braun2016-01-083-7/+7
| | | | llvm-svn: 257129
* [WebAssembly] Minor code cleanups. NFC.Dan Gohman2016-01-084-11/+6
| | | | llvm-svn: 257128
* [WebAssembly] Remove an unused def : Pat.Dan Gohman2016-01-081-6/+4
| | | | | | | WebAssemblyISelLowering.cpp does not wrap jump table nodes inside of WebAssemblywrapper nodes, so this pattern is not currently used. llvm-svn: 257127
* [WebAssembly] Remove unused arguments, unused functions. NFC.Dan Gohman2016-01-084-50/+33
| | | | llvm-svn: 257125
* Add some testing for thumb1 and thumb2 inline asm immediate constraintsEric Christopher2016-01-081-2/+2
| | | | | | | | and fix a couple of bugs on inspection. Also fixes PR26061. llvm-svn: 257122
* [LiveDebugValues] Replace several lines of code with operator[].Alexey Samsonov2016-01-071-16/+2
| | | | llvm-svn: 257114
* Instructions to be redone only if from the same BBAditya Nandakumar2016-01-071-1/+2
| | | | | | | While adding instructions (possible roots) to be redone, make sure they are from the same basic block. llvm-svn: 257112
* WebAssembly: use .skip instead of .zero directiveJF Bastien2016-01-071-0/+4
| | | | | | | | | | | | | | | | | .zero is confusing when used with two arguments. Documentation: This directive emits SIZE 0-valued bytes. SIZE must be an absolute expression. This directive is actually an alias for the '.skip' directive so it can take an optional second argument of the value to store in the bytes instead of zero. Using '.zero' in this way would be confusing however. Ref: https://sourceware.org/bugzilla/show_bug.cgi?id=18353 Hexagon and Sparc do the same, and it's all the same to WebAssembly so let's pick the less confusing of the two. llvm-svn: 257111
* [PGO] Minor refactoring /NFCXinliang David Li2016-01-071-5/+1
| | | | | | Move common defs into common header files. llvm-svn: 257108
* Temporarily revert r257105 "[Verifier] Check that debug values have proper size"Keno Fischer2016-01-072-64/+16
| | | | | | | Looks like there's a case where clang generates debug info that triggers the new verifier check. Reverting while investigating. llvm-svn: 257107
* [Verifier] Check that debug values have proper sizeKeno Fischer2016-01-072-16/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Teach the Verifier to make sure that the storage size given to llvm.dbg.declare or the value size given to llvm.dbg.value agree with what is declared in DebugInfo. This is implicitly assumed in a number of passes (e.g. in SROA). Additionally this catches a number of common mistakes, such as passing a pointer when a value was intended or vice versa. One complication comes from stack coloring which modifies the original IR when it merges allocas in order to make sure that if AA falls back to the IR it gets the correct result. However, given this new invariant, indiscriminately replacing one alloca by a different (differently sized) one is no longer valid. Fix this by just undefing out any use of the alloca in a dbg.declare in this case. Additionally, I had to fix a number of test cases. Of particular note: - I regenerated dbg-changes-codegen-branch-folding.ll from the given source as it was affected by the bug fixed in r256077 - two-cus-from-same-file.ll was changed to avoid having a variable-typed debug variable as that would depend on the target, even though this test is supposed to be generic - I had to manually declare size/align for reference type. See also the discussion for D14275/r253186. - fpstack-debuginstr-kill.ll required changing `double` to `long double` - most others were just a question of adding OP_deref Reviewers: aprantl Differential Revision: http://reviews.llvm.org/D14276 llvm-svn: 257105
* Turn off lldb debug tuning by default for FreeBSDDimitry Andric2016-01-071-1/+1
| | | | | | | | | | | | | | | | | | | | Summary: In rL242338, debugger tuning was introduced, and the tuning for FreeBSD was set to lldb by default. However, for the foreseeable future we still need to default to gdb tuning, since lldb is not ready for all of FreeBSD's architectures, and some system tools (like objcopy, etc) have not yet been adapted to cope with the lldb tuned format, which has .apple sections. Therefore, let FreeBSD use gdb by default for now. Reviewers: emaste, probinson Subscribers: llvm-commits, emaste Differential Revision: http://reviews.llvm.org/D15966 llvm-svn: 257103
* [SCCP] Don't violate the lattice invariantsDavid Majnemer2016-01-071-15/+42
| | | | | | | | | | We marked values which are 'undef' as constant instead of undefined which violates SCCP's invariants. If we can figure out that a computation results in 'undef', leave it in the undefined state. This fixes PR16052. llvm-svn: 257102
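The invariant in question is that a lattice value may only move in one direction: undefined, then constant, then overdefined. The following is a minimal sketch of such a monotone lattice; `LatticeVal` and its methods are hypothetical and much simpler than the SCCP pass itself.

```cpp
#include <cassert>
#include <cstdint>

// Three-level lattice: Undefined (nothing known yet) < Constant < Overdefined.
enum class State { Undefined, Constant, Overdefined };

// Hypothetical lattice cell. Each mark* method returns true if the state
// changed, and never moves the cell back down the lattice.
struct LatticeVal {
  State S = State::Undefined;
  std::int64_t C = 0;

  bool markOverdefined() {
    if (S == State::Overdefined)
      return false;
    S = State::Overdefined;
    return true;
  }

  bool markConstant(std::int64_t V) {
    if (S == State::Undefined) {
      S = State::Constant;
      C = V;
      return true;
    }
    if (S == State::Constant && C == V)
      return false; // same constant, nothing to do
    // A conflicting constant, or an already overdefined cell: the only legal
    // move is up to (or staying at) Overdefined, never back to Constant.
    return markOverdefined();
  }
};
```

In this model a computation known to yield 'undef' would simply never call `markConstant`, so the cell stays in the `Undefined` state.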
* WebAssembly: update expected failures, more asserts got resolved.JF Bastien2016-01-071-2/+0
| | | | llvm-svn: 257098
* Fix crash when printing instructions that have metadata attached but no parent.Mehdi Amini2016-01-071-1/+1
| | | | | | | | | | | | | | | | | | Fix PR24852 (crash with -debug -instcombine) Patch by Than McIntosh <thanm@google.com> Summary: Add guards to the asm writer to prevent crashing when dumping an instruction that has no basic block. Differential Revision: http://reviews.llvm.org/D15798 From: Than McIntosh <thanm@google.com> llvm-svn: 257094
* WebAssembly: update expected failures, assert got resolved by r257084.JF Bastien2016-01-071-18/+0
| | | | llvm-svn: 257093
* [PGO] Simplify coverage mapping loweringXinliang David Li2016-01-071-23/+11
| | | | | | | | | | | | | | Coverage mapping data may reference names of functions that are skipped by FE (e.g., unused inline functions). Since those functions are skipped, normal instr-prof function lowering pass won't put those names in the right section, so special handling is needed to walk through coverage mapping structure and recollect the references. With this patch, only names that are skipped are processed. This simplifies the lowering code and it no longer needs to make assumptions about the coverage mapping data layout. It should also be more efficient. llvm-svn: 257091
* Remove junk accidentally committed with r257087David Majnemer2016-01-071-1/+1
| | | | llvm-svn: 257089
* [SCCP] Can't go from overdefined to constantDavid Majnemer2016-01-071-3/+3
| | | | | | | | The fix for PR23999 made us mark loads of null as producing the constant undef which upsets the lattice. Instead, keep the load as "undefined". This fixes PR26044. llvm-svn: 257087
* [WebAssembly] Support combining GEP and FrameIndex offsets in memory operand offset fieldDerek Schuff2016-01-071-6/+12
| | | | | | | | | | | Previously we only supported putting the FI into memory operand offset fields if there was nothing there already. Now combine them. Differential Revision: http://reviews.llvm.org/D15941 llvm-svn: 257084
* [WebAssembly] Use the default private label prefixes.Dan Gohman2016-01-071-3/+0
| | | | | | | | | The MC assembler doesn't like using the empty string as a private label prefix because then it treats all labels as private. This commit reverts back to the default prefix, which is .L, which is common in ELF targets and consistent with the LLVM name mangler. llvm-svn: 257083
* AMDGPU/SI: Fold operands with sub-registersNicolai Haehnle2016-01-074-9/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Multi-dword constant loads generated unnecessary moves from SGPRs into VGPRs, increasing the code size and VGPR pressure. These moves are now folded away. Note that this lack of operand folding was not a problem for VMEM loads, because COPY nodes from VReg_Nnn to VGPR32 are eliminated by the register coalescer. Some tests are updated; note that the fsub.ll test explicitly checks that the move is elided. With the IR generated by current Mesa, the changes are obviously relatively minor (7063 shaders in 3531 tests). Totals: SGPRS: 351872 -> 352560 (0.20 %); VGPRS: 199984 -> 200732 (0.37 %); Code Size: 9876968 -> 9881112 (0.04 %) bytes; LDS: 91 -> 91 (0.00 %) blocks; Scratch: 1779712 -> 1767424 (-0.69 %) bytes per wave; Wait states: 295164 -> 295337 (0.06 %). Totals from affected shaders: SGPRS: 65784 -> 66472 (1.05 %); VGPRS: 38064 -> 38812 (1.97 %); Code Size: 1993828 -> 1997972 (0.21 %) bytes; LDS: 42 -> 42 (0.00 %) blocks; Scratch: 795648 -> 783360 (-1.54 %) bytes per wave; Wait states: 54026 -> 54199 (0.32 %). Reviewers: tstellarAMD, arsenm, mareko Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15875 llvm-svn: 257074
* AMDGPU/SI: xnack_mask is always reserved on VINicolai Haehnle2016-01-072-36/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Somehow, I first interpreted the docs as saying space for xnack_mask is only reserved when XNACK is enabled via SH_MEM_CONFIG. I felt uneasy about this and went back to actually test what is happening, and it turns out that xnack_mask is always reserved at least on Tonga and Carrizo, in the sense that flat_scr is always fixed below the SGPRs that are used to implement xnack_mask, whether or not they are actually used. I confirmed this by writing a shader using inline assembly to tease out the aliasing between flat_scratch and regular SGPRs. For example, on Tonga, where we fix the number of SGPRs to 80, s[74:75] aliases flat_scratch (so xnack_mask is s[76:77] and vcc is s[78:79]). This patch changes both the calculation of the total number of SGPRs and the various register reservations to account for this. It ought to be possible to use the gap left by xnack_mask when the feature isn't used, but this patch doesn't try to do that. (Note that the same applies to vcc.) Note that previously, even before my earlier change in r256794, the SGPRs that alias to xnack_mask could end up being used as well when flat_scr was unused and the total number of SGPRs happened to fall on the right alignment (e.g. highest regular SGPR being used s29 and VCC used would lead to number of SGPRs being 32, where s28 and s29 alias with xnack_mask). So if there were some conflict due to such aliasing, we should have noticed that already. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15898 llvm-svn: 257073
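The two examples in this message (80 SGPRs on Tonga, and a total of 32) are consistent with a simple layout in which the top six SGPRs are reserved as three pairs. The sketch below generalizes from those two data points only, which is an assumption; `reservedAtTop` and the struct names are hypothetical.

```cpp
#include <cassert>

// Inclusive range of two consecutive SGPR indices, e.g. s[76:77].
struct SgprPair {
  unsigned Lo, Hi;
};

// The three register pairs assumed to sit at the top of the SGPR file.
struct ReservedSgprs {
  SgprPair FlatScratch, XnackMask, Vcc;
};

// Assumed layout: vcc occupies the top pair, xnack_mask the pair below it,
// and flat_scratch the pair below that.
ReservedSgprs reservedAtTop(unsigned NumSgprs) {
  assert(NumSgprs >= 6 && "need room for three register pairs");
  return {{NumSgprs - 6, NumSgprs - 5},
          {NumSgprs - 4, NumSgprs - 3},
          {NumSgprs - 2, NumSgprs - 1}};
}
```

With 80 SGPRs this gives flat_scratch at s[74:75], xnack_mask at s[76:77] and vcc at s[78:79], matching the Tonga figures quoted in the message.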
* [AVX512] add PSLLW and PSLLV IntrinsicMichael Zuckerman2016-01-071-0/+13
| | | | | | Differential Revision: http://reviews.llvm.org/D15889 llvm-svn: 257070
* Revert r257064. It caused failures in some sanitizer tests.Silviu Baranga2016-01-071-322/+3
| | | | llvm-svn: 257069
* Fix build after r257064: we should be returning false, not nullptrSilviu Baranga2016-01-071-2/+2
| | | | llvm-svn: 257067
* Revert r257055, it caused PR26064.Nico Weber2016-01-071-7/+2
| | | | llvm-svn: 257066
* [InstCombine] Look through PHIs, GEPs, IntToPtrs and PtrToInts to expose more constants when comparing GEPsSilviu Baranga2016-01-071-3/+322
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When comparing two GEP instructions which have the same base pointer and one of them has a constant index, it is possible to only compare indices, transforming it to a compare with a constant. This removes one use for the GEP instruction with the constant index, can reduce register pressure and can sometimes lead to removing the comparison entirely. InstCombine was already doing this when comparing two GEPs if the base pointers were the same. However, in the case where we have complex pointer arithmetic (GEPs applied to GEPs, PHIs of GEPs, conversions to or from integers, etc) the value of the original base pointer will be hidden to the optimizer and this transformation will be disabled. This change detects when the two sides of the comparison can be expressed as GEPs with the same base pointer, even if they don't appear as such in the IR. The transformation will convert all the pointer arithmetic to arithmetic done on indices and all the relevant uses of GEPs to GEPs with a common base pointer. The GEP comparison will be converted to a comparison done on indices. Reviewers: majnemer, jmolloy Subscribers: hfinkel, jevinskie, jmolloy, aadg, llvm-commits Differential Revision: http://reviews.llvm.org/D15146 llvm-svn: 257064
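At the source level, the core equivalence can be sketched as follows: when two pointers are derived from the same base, comparing the pointers gives the same answer as comparing the indices. This is only an analogy in C++ for in-bounds accesses to one array, not the InstCombine code; both function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Compare two addresses computed from the same base pointer.
// Both indices must stay within (or one past) the same array.
bool comparePointers(const int *Base, std::ptrdiff_t I, std::ptrdiff_t C) {
  return (Base + I) < (Base + C);
}

// The equivalent comparison on indices alone: the base pointer drops out,
// and if C is a constant this becomes a compare against a constant.
bool compareIndices(std::ptrdiff_t I, std::ptrdiff_t C) {
  return I < C;
}
```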
* [AVX512] add PSRAV IntrinsicMichael Zuckerman2016-01-071-0/+7
| | | | | | Differential Revision: http://reviews.llvm.org/D15856 llvm-svn: 257063
* Added support for macro emission in dwarf (supporting DWARF version 4).Amjad Aboud2016-01-077-4/+140
| | | | | | Differential Revision: http://reviews.llvm.org/D15495 llvm-svn: 257060
* [GlobalsAA] Partially back out r248576James Molloy2016-01-071-15/+0
| | | | | | | | | | | | | | | | | See PR25822 for a fuller summary, but we were conflating the concepts of "capture" and "escape". We were proving nocapture and using that proof to infer noescape, which is not true. Escaped-ness is a function-local property: as soon as a value is used in a call argument it escapes. Capturedness is a related but distinct property. It implies a *temporally limited* escape. Consider:

    static int a;
    int b;
    int g(int * nocapture arg);
    int f() {
      a = 2;  // Even though a escapes to g, it is not captured so can be treated as non-escaping here.
      g(&a);  // But here it must be treated as escaping.
      g(&b);  // Now that g(&a) has returned we know it was not captured so we can treat it as non-escaping again.
    }

The original commit did not sufficiently understand this nuance and so caused PR25822 and PR26046. r248576 included both a performance improvement (which has been backed out) and a related conformance fix (which has been kept along with its testcase). llvm-svn: 257058
* [AVX512] add PSHUFHW and PSHUFLW Intrinsic Michael Zuckerman2016-01-071-0/+12
| | | | | | Differential Revision: http://reviews.llvm.org/D15925 llvm-svn: 257056
* [X86][AVX] Match broadcast loads through a bitcastSimon Pilgrim2016-01-071-2/+7
| | | | | | | | AVX1 v8i32/v4i64 shuffles are bitcasted to v8f32/v4f64, this patch peeks through bitcasts to check for a load node to allow broadcasts to occur. Follow up to D15310 llvm-svn: 257055
* Added AVRTargetObjectFile class and AVR.hDylan McKay2016-01-074-0/+130
| | | | llvm-svn: 257049
* Mark arm as the 32bit variant of aarch64 in TripleTamas Berghammer2016-01-071-28/+28
| | | | | | | | Change Triple::get32BitArchVariant to return arm/armeb as the 32bit variant of aarch64/aarch64_be and make the same change for the opposite direction in Triple::get64BitArchVariant. Differential revision: http://reviews.llvm.org/D15529 llvm-svn: 257048
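The mapping described here can be modeled with a small standalone enum. This is a toy model for illustration only: the `Arch` enum and the two free functions are hypothetical and do not use the llvm::Triple class.

```cpp
#include <cassert>

// Toy architecture enum covering only the cases named in this commit.
enum class Arch { arm, armeb, aarch64, aarch64_be, unknown };

// 32-bit counterpart: aarch64 -> arm, aarch64_be -> armeb.
Arch get32BitVariant(Arch A) {
  switch (A) {
  case Arch::aarch64:
    return Arch::arm;
  case Arch::aarch64_be:
    return Arch::armeb;
  case Arch::arm:
  case Arch::armeb:
    return A; // already 32-bit
  default:
    return Arch::unknown;
  }
}

// 64-bit counterpart: arm -> aarch64, armeb -> aarch64_be.
Arch get64BitVariant(Arch A) {
  switch (A) {
  case Arch::arm:
    return Arch::aarch64;
  case Arch::armeb:
    return Arch::aarch64_be;
  case Arch::aarch64:
  case Arch::aarch64_be:
    return A; // already 64-bit
  default:
    return Arch::unknown;
  }
}
```

The two functions are inverses on these four values, and endianness is preserved in both directions.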
* Remove extra whitespace. NFC.Junmo Park2016-01-071-1/+1
| | | | llvm-svn: 257047
* [X86][SSE] Add INSERTPS as a target shuffleSimon Pilgrim2016-01-071-3/+17
| | | | | | Follow up to D15378, added INSERTPS to the list of decodable target shuffles and enabled XFormVExtractWithShuffleIntoLoad to handle target shuffles with SentinelZero and tested this with INSERTPS. llvm-svn: 257046
* [AVX512] add PSHUFD IntrinsicMichael Zuckerman2016-01-071-0/+6
| | | | | | Differential Revision: http://reviews.llvm.org/D15934 llvm-svn: 257044
* ARM: support TLS accesses on Darwin platformsTim Northover2016-01-0710-8/+132
| | | | | | | | Darwin TLS accesses most closely resemble ELF's general-dynamic situation, since they have to be able to handle all possible situations. The descriptors and so on are obviously slightly different though. llvm-svn: 257039
* [SystemZ] Add hasSideEffects flag on Serialize instruction.Jonas Paulsson2016-01-071-0/+3
| | | | | | | | Serialize performs a hardware serialization operation and acts as a memory barrier. Therefore it must have the hasSideEffects flag set so it will be treated as a global memory object. Reviewed by Ulrich Weigand llvm-svn: 257036
* [X86] Remove superfluous mayLoad flag. The pattern already implies it.Craig Topper2016-01-071-3/+1
| | | | llvm-svn: 257035