Commit message | Author | Age | Files | Lines
* New test case: make sure alloc bit is not set for covmap section on LinuxXinliang David Li2016-02-171-0/+25
| | | | llvm-svn: 261038
* [WebAssembly] Use SDValue::getConstantOperandVal. NFC.Dan Gohman2016-02-171-1/+1
| | | | llvm-svn: 261037
* Fix MSVC bot: apparently visual studio does not like explicitly defaulted ↵Mehdi Amini2016-02-171-1/+3
| | | | | | | move ctor From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 261036
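The kind of workaround this message points at can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `Holder`, its members, and `demoMove` are hypothetical names, and the sketch assumes a compiler that rejects `= default` on a move constructor, so the member-wise move is written out by hand.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type, used only for illustration.
struct Holder {
  std::string name;
  std::vector<int> data;

  Holder() = default;

  // Preferred spelling, which some older MSVC versions reject:
  //   Holder(Holder &&) = default;
  // Portable fallback: move each member explicitly.
  Holder(Holder &&other)
      : name(std::move(other.name)), data(std::move(other.data)) {}
};

// Moves one Holder into another and checks that the contents arrived.
inline void demoMove() {
  Holder a;
  a.name = "example";
  a.data = {1, 2, 3};
  Holder b(std::move(a));
  assert(b.name == "example");
  assert(b.data.size() == 3);
}
```

The hand-written constructor has to be kept in sync with the member list, which is the usual cost of avoiding `= default`.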
* Fix build LLVM with -D LLVM_USE_INTEL_JITEVENTS:BOOL=ON on WindowsAndrew Kaylor2016-02-162-2/+11
| | | | | | Differential Revision: http://reviews.llvm.org/D16940 llvm-svn: 261033
* [WebAssembly] Implement __builtin_frame_address.Dan Gohman2016-02-165-8/+54
| | | | | | Differential Revision: http://reviews.llvm.org/D17307 llvm-svn: 261032
* Query the StringMap only once when creating MDString (NFC)Mehdi Amini2016-02-162-12/+7
| | | | | | | | | | | | | Summary: Loading IR with debug info improves MDString::get() from 19ms to 10ms. Reviewers: dexonsmith Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16597 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 261030
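The "query the map only once" idea can be illustrated with a small sketch. It assumes `std::unordered_map` as a stand-in for LLVM's `StringMap`, and `InternedString`, `Pool`, `getTwoLookups`, and `getOneLookup` are hypothetical names that do not appear in the commit.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for an interned metadata string.
struct InternedString {
  int id = -1; // -1 means "not yet initialized"
};

using Pool = std::unordered_map<std::string, InternedString>;

// Two queries on a miss: find() hashes the key, then operator[] hashes it again.
inline InternedString &getTwoLookups(Pool &pool, const std::string &key,
                                     int &nextId) {
  auto it = pool.find(key);
  if (it != pool.end())
    return it->second;
  InternedString &slot = pool[key];
  slot.id = nextId++;
  return slot;
}

// One query: insert() either returns the existing entry or adds a new one,
// and the bool tells us which case we hit.
inline InternedString &getOneLookup(Pool &pool, const std::string &key,
                                    int &nextId) {
  std::pair<Pool::iterator, bool> result =
      pool.insert(std::make_pair(key, InternedString()));
  if (result.second)
    result.first->second.id = nextId++;
  return result.first->second;
}

// Checks that repeated requests for the same key return the same entry.
inline void demoPool() {
  Pool pool;
  int nextId = 0;
  assert(getOneLookup(pool, "a", nextId).id == 0);
  assert(getOneLookup(pool, "a", nextId).id == 0);
}
```

Both helpers return the same entry for the same key; the second simply avoids hashing the key twice on a miss.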
* Define the ThinLTO Pipeline (experimental)Mehdi Amini2016-02-162-2/+47
| Summary:
| Unlike full LTO, ThinLTO can afford to shift compile time from the
| frontend to the linker: both phases are parallel (even if it is not
| totally "free": projects like clang reuse the products of the
| "compile phase" for multiple links; think of libLLVMSupport being
| reused for opt, llc, etc.).
|
| This pipeline is based on the proposal in D13443 for full LTO. We
| didn't move forward on this proposal because the LTO link was far too
| long after that. We believe that we can afford it with ThinLTO.
|
| The ThinLTO pipeline integrates in the regular O2/O3 flow:
|
|  - The compile phase runs the inliner with a somewhat lighter
|    function simplification. (TODO: tune the inliner thresholds here)
|    This is intended to simplify the IR and get rid of obvious things
|    like linkonce_odr that will be inlined.
|  - The link phase will run the pipeline from the start, extended with
|    some specific passes that leverage the augmented knowledge we have
|    during LTO. Especially after the inliner is done, a sequence of
|    globalDCE/globalOpt is performed, followed by another run of the
|    "function simplification" passes. It is not clear if this part
|    of the pipeline will stay as is, as the split model of ThinLTO
|    does not allow the same benefit as FullLTO without added tricks.
|
| The measurements on the public test suite as well as on our internal
| suite show an overall net improvement. The binary size for the clang
| executable is reduced by 5%. We're still tuning it with the bringup
| of ThinLTO and it will evolve, but this should provide a good starting
| point.
|
| Reviewers: tejohnson
| Differential Revision: http://reviews.llvm.org/D17115
| From: Mehdi Amini <mehdi.amini@apple.com>
| llvm-svn: 261029
* Refactor the PassManagerBuilder: extract a ↵Mehdi Amini2016-02-162-72/+77
| "addFunctionSimplificationPasses()" (NFC)
|
| It is intended to contain the passes run over a function after the
| inliner is done with that function and before it moves to its callers.
|
| From: Mehdi Amini <mehdi.amini@apple.com>
| llvm-svn: 261028
* Fix test from r261013Adam Nemet2016-02-161-0/+1
| | | | llvm-svn: 261027
* [X86][AVX] Regenerated vselect testsSimon Pilgrim2016-02-161-34/+68
| | | | llvm-svn: 261026
* [X86] Remove the now-unused X86ISD::PSIGN. NFC.Ahmed Bougacha2016-02-166-46/+30
| | | | llvm-svn: 261025
* [X86] Generalize logic blend of (x, -x) combine to match (-x, x).Ahmed Bougacha2016-02-162-17/+21
| | | | | | I suspect this is what let PR26110 lie dormant for so long. llvm-svn: 261024
* [X86] Don't turn (c?-v:v) into (c?-v:0) by blindly using PSIGN.Ahmed Bougacha2016-02-162-37/+54
| Currently, we sometimes miscompile this vector pattern:
|     (c ? -v : v)
| We lower it to (because "c" is <4 x i1>, lowered as a vector mask):
|     (~c & v) | (c & -v)
| When we have SSSE3, we incorrectly lower that to PSIGN, which does:
|     (c < 0 ? -v : c > 0 ? v : 0)
| in other words, when c is either all-ones or all-zero:
|     (c ? -v : 0)
| While this is an old bug, it rarely triggers because the PSIGN combine
| is too sensitive to operand order. This will be improved separately.
|
| Note that the PSIGN tests are also incorrect. Consider:
|     %b.lobit = ashr <4 x i32> %b, <i32 31, i32 31, i32 31, i32 31>
|     %sub = sub nsw <4 x i32> zeroinitializer, %a
|     %0 = xor <4 x i32> %b.lobit, <i32 -1, i32 -1, i32 -1, i32 -1>
|     %1 = and <4 x i32> %a, %0
|     %2 = and <4 x i32> %b.lobit, %sub
|     %cond = or <4 x i32> %1, %2
|     ret <4 x i32> %cond
| if %b is zero:
|     %b.lobit = <4 x i32> zeroinitializer
|     %sub = sub nsw <4 x i32> zeroinitializer, %a
|     %0 = <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
|     %1 = <4 x i32> %a
|     %2 = <4 x i32> zeroinitializer
|     %cond = or <4 x i32> %a, zeroinitializer
|     ret <4 x i32> %a
| whereas we currently generate:
|     psignd %xmm1, %xmm0
|     retq
| which returns 0, as %xmm1 is 0.
|
| Instead, use a pure logic sequence, as described in:
|     https://graphics.stanford.edu/~seander/bithacks.html#ConditionalNegate
|
| Fixes PR26110.
| Differential Revision: http://reviews.llvm.org/D17181
| llvm-svn: 261023
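The difference between the three lowerings can be checked on a single lane with a scalar model. This is only a sketch under stated assumptions: the helper names are hypothetical, each function models one 32-bit lane rather than a whole vector, and the "pure logic sequence" is taken to be the `(v ^ mask) - mask` form of the conditional-negate trick, which may not match the exact instructions the patch emits.

```cpp
#include <cassert>
#include <cstdint>

// Two's-complement negation on an unsigned lane (avoids signed overflow).
inline uint32_t negateLane(uint32_t v) { return 0u - v; }

// Scalar model of one PSIGND lane: (c < 0 ? -v : c > 0 ? v : 0).
inline uint32_t psignLane(uint32_t v, uint32_t c) {
  if (c >> 31)
    return negateLane(v); // c is negative
  if (c != 0)
    return v; // c is positive
  return 0;   // c is zero
}

// The blend the source pattern asks for: (~c & v) | (c & -v).
inline uint32_t blendLane(uint32_t v, uint32_t mask) {
  return (~mask & v) | (mask & negateLane(v));
}

// Conditional negate with an all-ones / all-zero mask: (v ^ mask) - mask.
inline uint32_t conditionalNegateLane(uint32_t v, uint32_t mask) {
  return (v ^ mask) - mask;
}

// Checks that the logic sequence agrees with the blend for both mask values.
inline void checkLane(uint32_t v) {
  const uint32_t allZero = 0u;
  const uint32_t allOnes = 0xFFFFFFFFu;
  assert(conditionalNegateLane(v, allZero) == blendLane(v, allZero));
  assert(conditionalNegateLane(v, allOnes) == blendLane(v, allOnes));
}
```

With an all-zero mask and `v = 5`, `blendLane` and `conditionalNegateLane` both yield 5, while `psignLane` yields 0, which is the miscompile described above.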
* [X86] Extract PSIGN/BLENDVP tests into vector-blend.ll. NFC.Ahmed Bougacha2016-02-163-59/+251
| | | | | | | | | | We're going to stop generating PSIGN, so calling a test "psign" isn't ideal. Instead, call these tests what they really are: variable blends using logic. Also add a test to exhibit a case we're currently missing in the PSIGN combine. llvm-svn: 261022
* [X86] Extract PSIGN/BLENDVP combine. NFC.Ahmed Bougacha2016-02-161-77/+95
| | | | llvm-svn: 261021
* [X86] Extract ANDNP combine. NFC.Ahmed Bougacha2016-02-161-61/+57
| | | | | | This makes it IMO more readable and reduces indentation. llvm-svn: 261020
* Bitcode writer: fix a typo, using getName() instead of getSourceFileName()Mehdi Amini2016-02-161-2/+2
| | | | | | | | When emitting the source filename, the encoding of the string was checked against the name instead of the filename. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 261019
* [WebAssembly] Update torture test expectationsDerek Schuff2016-02-161-7/+0
| | | | | | These were fixed with r260978 llvm-svn: 261017
* [codeview] Bail on a DBG_VALUE register operand with no registerReid Kleckner2016-02-162-6/+10
| | | | | | | | | | This apparently comes up when the register allocator decides that a variable will become undef along a certain path. Also improve the error message we emit when we can't map from LLVM register number to CV register number. llvm-svn: 261016
* [WebAssembly] Don't move calls or stores past intervening loadsDerek Schuff2016-02-162-0/+40
| | | | | | | | | | The register stackifier currently checks for intervening stores (and loads that may alias them) but doesn't account for the fact that the instruction being moved may affect intervening loads. Differential Revision: http://reviews.llvm.org/D17298 llvm-svn: 261014
* [LTO] Support StatisticsAdam Nemet2016-02-162-0/+13
| | | | | | | | | | | | | | | | | | | | | | | Summary: I thought -Xlinker -mllvm -Xlinker -stats worked at some point but maybe it never did. For clang, I believe that stats are printed from cc1_main. This patch also prints them for LTO, specifically right after codegen happens. I only looked at the C API for LTO briefly to see if this is a good place. Probably there are still cases where this wouldn't be printed but it seems to be working for the common case. I also experimented putting this in the LTOCodeGenerator destructor but that didn't trigger for me because ld64 does not destroy the LTOCodeGenerator. Reviewers: dexonsmith, joker.eph Subscribers: rafael, joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D17302 llvm-svn: 261013
* [codeview] Fix assertion on non-memory, non-register DBG_VALUE instructionsReid Kleckner2016-02-162-0/+79
| | | | | | | Eventually we should find a way to describe constant variables, but it is not obvious how to do this at the moment. llvm-svn: 261010
* [Hexagon] Adding relocation for code size, cold path optimization allowing a ↵Colin LeMahieu2016-02-1613-1/+77
| 23-bit 4-byte aligned relocation to be a valid instruction encoding.
|
| The usual way to get a 32-bit relocation is to use a constant extender,
| which doubles the size of the instruction from 4 bytes to 8 bytes.
|
| Another way is to put a .word32 and mix code and data within a function.
| The disadvantage is that it's not a valid instruction encoding, and
| jumping over it causes prefetch stalls inside the hardware.
|
| This relocation packs a 23-bit value into an "r0 = add(rX, #a)"
| instruction by overwriting the source register bits. Since r0 is the
| return value register, if this instruction is placed after a function
| call which returns void, r0 will be filled with an undefined value, the
| prefetch won't be confused, and the callee can access the constant value
| by way of the link register.
|
| llvm-svn: 261006
* [AArch64] Add pass to remove redundant copy after RAJun Bum Lim2016-02-165-0/+256
| Summary:
| This change adds a pass to remove unnecessary zero copies in target
| blocks of cbz/cbnz instructions. E.g., the copy instruction in the code
| below can be removed because the cbz jumps to BB1 when x0 is zero:
|     BB0:
|       cbz x0, .BB1
|     BB1:
|       mov x0, xzr
|
| Jun
|
| Reviewers: gberry, jmolloy, HaoLiu, MatzeB, mcrosier
| Subscribers: mcrosier, mssimpso, haicheng, bmakam, llvm-commits, aemerson, rengolin
| Differential Revision: http://reviews.llvm.org/D16203
| llvm-svn: 261004
* [GlobalISel] Re-apply r260922-260923 with MSVC-friendly code.Quentin Colombet2016-02-1612-127/+243
| | | | | | | | | Original message: Get rid of the ifdefs in TargetLowering. Introduce a new API used only by GlobalISel: CallLowering. This API will contain target hooks dedicated to call lowering. llvm-svn: 260998
* Pass a std::unique_ptr to IRMover::move.Rafael Espindola2016-02-166-62/+63
| | | | | | | It was already the one "destroying" the source module, now the API reflects that. llvm-svn: 260989
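The API point made here, that a function which consumes its argument should say so in its signature, can be sketched with toy types. `ToyModule`, `ToyMover`, and `demoMover` are hypothetical and are not the LLVM classes; the sketch only shows the ownership pattern.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a module; not the LLVM class.
struct ToyModule {
  std::vector<std::string> functions;
};

// Hypothetical mover that drains a source module into a destination.
class ToyMover {
  ToyModule &dest;

public:
  explicit ToyMover(ToyModule &d) : dest(d) {}

  // Taking the source by std::unique_ptr makes the ownership transfer
  // explicit: the source is destroyed when this function returns.
  void move(std::unique_ptr<ToyModule> src) {
    for (std::string &fn : src->functions)
      dest.functions.push_back(std::move(fn));
  }
};

// Moves one function name across and checks that the caller lost ownership.
inline void demoMover() {
  ToyModule dest;
  ToyMover mover(dest);
  std::unique_ptr<ToyModule> src(new ToyModule());
  src->functions.push_back("f");
  mover.move(std::move(src));
  assert(src == nullptr);
  assert(dest.functions.size() == 1);
}
```

The caller has to write `std::move(src)` at the call site, so the hand-off is visible where it happens rather than hidden inside the callee.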
* [WebAssembly] Insert COPY_LOCAL between CopyToReg and FrameIndex DAG nodesDerek Schuff2016-02-164-24/+55
| | | | | | | | | | | | | | CopyToReg nodes don't support FrameIndex operands. Other targets select the FI to some LEA-like instruction, but since we don't have that, we need to insert some kind of instruction that can take an FI operand and produces a value usable by CopyToReg (i.e. in a vreg). So insert a dummy copy_local between Op and its FI operand. This results in a redundant copy which we should optimize away later (maybe in the post-FI-lowering peephole pass). Differential Revision: http://reviews.llvm.org/D17213 llvm-svn: 260987
* [AMDGPU] Rename $dst operand to $vdst for VOP instructions.Tom Stellard2016-02-166-77/+120
| | | | | | | | | | | | | | Summary: This change renames output operand for VOP instructions from dst to vdst. This is needed to enable decoding named operands for disassembler. Reviewers: vpykhtin, tstellarAMD, arsenm Subscribers: arsenm, llvm-commits, nhaustov Projects: #llvm-amdgpu-spb Differential Revision: http://reviews.llvm.org/D16920 llvm-svn: 260986
* Revert 260705, it appears to be causing pr26628Philip Reames2016-02-162-76/+0
| | | | | | The root issue appears to be a confusion around what makeNoWrapRegion actually does. It seems likely we need two versions of this function with slightly different semantics. llvm-svn: 260981
* [X86] Enable the LEA optimization pass by default.Andrey Turetskiy2016-02-163-6/+7
| | | | | | Differential Revision: http://reviews.llvm.org/D16877 llvm-svn: 260979
* [WebAssembly] Switch from RPO sorting to topological sorting.Dan Gohman2016-02-162-196/+300
| | | | | | | | | | WebAssembly doesn't require full RPO; topological sorting is sufficient and can preserve more of the MachineBlockPlacement ordering. Unfortunately, this still depends a lot on heuristics, because while we use the MachineBlockPlacement ordering as a guide, we can't use it in places where it isn't topologically ordered. This area will require further attention. llvm-svn: 260978
* A signed bitfield's range is [-1,0], so assigning 1 is technically an ↵Aaron Ballman2016-02-161-1/+1
| | | | | | overflow. However, the other bitfield requires a signed value (it supports negative offsets), so it is slightly better to retain a signed 1-bit bitfield and use -1 instead of 1. Silences an MSVC warning. llvm-svn: 260973
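A small sketch of the bitfield point, assuming a hypothetical struct (`PackedFields`, with an arbitrary 7-bit offset width) rather than the one touched by the commit: a signed 1-bit field can represent only -1 and 0, so "set" is stored as -1 and tested as non-zero.

```cpp
#include <cassert>

// Hypothetical layout, for illustration only.
struct PackedFields {
  signed int offset : 7; // signed: must be able to hold negative offsets
  signed int flag : 1;   // signed 1-bit field: representable values are -1 and 0
};

// Treats any non-zero value as "set", so -1 reads as true.
inline bool isFlagSet(const PackedFields &p) { return p.flag != 0; }

// Builds a value with the flag set, using -1 because 1 is out of range.
inline PackedFields makeFlagged(int offset) {
  assert(offset >= -64 && offset <= 63); // range of a signed 7-bit field
  PackedFields p;
  p.offset = offset;
  p.flag = -1;
  return p;
}
```

Comparing the flag against zero, rather than against 1, keeps call sites correct regardless of which non-zero value the field holds.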
* Reverting r260922-260923; they cause link failures with MSVC.Aaron Ballman2016-02-1612-240/+125
| | | | | | | http://lab.llvm.org:8011/builders/lldb-x86-windows-msvc2015/builds/15436/steps/build/logs/stdio http://bb.pgr.jp/builders/msbuild-llvmclang-x64-msc18-DA/builds/961/steps/build_llvm/logs/stdio llvm-svn: 260972
* [WebAssembly] Create new registers instead of reusing old ones in RegStackify.Dan Gohman2016-02-163-37/+82
| | | | | | | | This avoids some complications updating LiveIntervals to be aware of the new register lifetimes, because we can just compute new intervals from scratch rather than describe how the old ones have been changed. llvm-svn: 260971
* Reapply r260489.Rafael Espindola2016-02-162-0/+41
| | | | | | | | | | Original commit message: [readobj] Dump DT_JMPREL relocations when outputting dynamic relocations. The bits of r260488 it depends on have been committed. llvm-svn: 260970
* [WebAssembly] Implement support for custom NaN bit patterns.Dan Gohman2016-02-165-14/+71
| | | | llvm-svn: 260968
* Introduce a getAsRange helper.Rafael Espindola2016-02-162-26/+10
| | | | | | | | | This requires making an error message a bit more generic, but that seems a reasonable tradeoff. Extracted from r260488 but simplified a bit. llvm-svn: 260967
* Move DynRegionInfo out of the ELFDumper.Rafael Espindola2016-02-161-11/+11
| | | | | | | | This reduces indentation in preparation to adding a bit more code to it. Extracted from r260488. llvm-svn: 260963
* This reverts commit r260488 and r260489.Rafael Espindola2016-02-1610-442/+110
| | | | | | | | | | | Original messages: Revert "[readobj] Handle ELF files with no section table or with no program headers." Revert "[readobj] Dump DT_JMPREL relocations when outputting dynamic relocations." r260489 depends on r260488 and among other issues r260488 deleted error handling code. llvm-svn: 260962
* [X86] PR26575: Fix LEA optimization pass.Andrey Turetskiy2016-02-162-0/+37
| | | | | | | | | | Add a missing check for a type of address displacement operand of the load/store instruction being a candidate for LEA substitution. Ref: https://llvm.org/bugs/show_bug.cgi?id=26575 Differential Revision: http://reviews.llvm.org/D17261 llvm-svn: 260959
* [Hexagon] Hoist nonnull assert up.Benjamin Kramer2016-02-162-1/+1
| | | | | | | | Once a pointer is turned into a reference it cannot be nullptr, clang rightfully warns about this assert being a tautology. Put the assert before the reference is created. llvm-svn: 260949
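The ordering issue can be shown with a minimal sketch. `Node` and `readValue` are hypothetical names; the point is only that the null check has to happen while the value is still a pointer.

```cpp
#include <cassert>

// Hypothetical type, used only for illustration.
struct Node {
  int value;
};

// The assert is placed before the dereference. Writing
//   const Node &n = *p; assert(&n != nullptr);
// instead would be a tautology, because a reference is assumed to be bound
// to a valid object.
inline int readValue(const Node *p) {
  assert(p != nullptr && "expected a non-null node");
  const Node &n = *p;
  return n.value;
}
```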
* Make sure the functions' range is empty before going through it in the LLVM ↵Amaury Sechet2016-02-162-0/+8
| | | | | | C API test llvm-svn: 260947
* [X86] Fix typos. NFCCraig Topper2016-02-161-2/+2
| | | | llvm-svn: 260943
* [X86] Use range-based for loop. NFCCraig Topper2016-02-161-3/+2
| | | | llvm-svn: 260942
* Do some refactoring in constant generation in the C API echo test. NFCAmaury Sechet2016-02-161-8/+10
| | | | llvm-svn: 260941
* [X86] Fix typo in comment. NFCCraig Topper2016-02-161-1/+1
| | | | llvm-svn: 260940
* Generate functions in 2 steps in the C API echo test. NFCAmaury Sechet2016-02-161-6/+32
| | | | llvm-svn: 260939
* [SCEVExpander] Make findExistingExpansion smarterJunmo Park2016-02-164-28/+77
| Summary:
| Extend findExistingExpansion so that it can use an existing value in
| ExprValueMap. This patch gives 0.3~0.5% performance improvements on
| benchmarks (test-suite, spec2000, spec2006, commercial benchmark).
|
| Reviewers: mzolotukhin, sanjoy, zzheng
| Differential Revision: http://reviews.llvm.org/D15559
| llvm-svn: 260938
* Restore the capability to manipulate datalayout from the C APIAmaury Sechet2016-02-166-0/+38
| Summary:
| This consists of various additions to the C API:
|
|     LLVMTargetDataRef LLVMGetModuleDataLayout(LLVMModuleRef M);
|     void LLVMSetModuleDataLayout(LLVMModuleRef M, LLVMTargetDataRef DL);
|     LLVMTargetDataRef LLVMCreateTargetMachineData(LLVMTargetMachineRef T);
|
| Reviewers: joker.eph, Wallbraker, echristo
| Subscribers: axw
| Differential Revision: http://reviews.llvm.org/D17255
| llvm-svn: 260936
* [TableGen] Fix inconsistent spacing. NFCCraig Topper2016-02-161-2/+2
| | | | llvm-svn: 260935