path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* [X86] Fix formatting. NFC | Craig Topper | 2016-11-17 | 1 | -2/+2
  llvm-svn: 287211
* [XRay] Support AArch64 in LLVM | Dean Michael Berris | 2016-11-17 | 3 | -1/+126
  This patch adds XRay support in LLVM for AArch64 targets.
  This patch is one of a series:
  Clang: https://reviews.llvm.org/D26415
  compiler-rt: https://reviews.llvm.org/D26413
  Author: rSerge
  Reviewers: rengolin, dberris
  Subscribers: amehsan, aemerson, llvm-commits, iid_iunknown
  Differential Revision: https://reviews.llvm.org/D26412
  llvm-svn: 287209
* [CMake] NFC. Updating CMake dependency specifications | Chris Bieneman | 2016-11-17 | 22 | -44/+66
  This patch updates a bunch of places where add_dependencies was being explicitly called to add dependencies on intrinsics_gen to instead use the DEPENDS named parameter. This cleanup is needed for a patch I'm working on to add a dependency debugging mode to the build system.
  llvm-svn: 287206
* [AMDGPU] Custom lower f16 = fp_round f64 | Konstantin Zhuravlyov | 2016-11-17 | 2 | -0/+23
  llvm-svn: 287203
* [AMDGPU] Promote f16/i16 conversions to f32/i32 | Konstantin Zhuravlyov | 2016-11-17 | 2 | -58/+8
  llvm-svn: 287201
* [AMDGPU] Expand `br_cc` for f16 | Konstantin Zhuravlyov | 2016-11-17 | 1 | -0/+1
  Differential Revision: https://reviews.llvm.org/D26732
  llvm-svn: 287199
* Use profile info to adjust loop unroll threshold. | Dehao Chen | 2016-11-17 | 2 | -0/+52
  Summary:
  For a flat loop, even a hot one, unrolling at runtime is not a good idea, so we set a lower partial unroll threshold. For a hot loop, we set a higher unroll threshold and allow expensive trip count computation so that unrolling can be more aggressive.
  Reviewers: davidxl, mzolotukhin
  Subscribers: sanjoy, mehdi_amini, llvm-commits
  Differential Revision: https://reviews.llvm.org/D26527
  llvm-svn: 287186
* Introduce GlobalSplit pass. | Peter Collingbourne | 2016-11-16 | 4 | -0/+171
  This pass splits globals into elements using inrange annotations on getelementptr indices.
  Differential Revision: https://reviews.llvm.org/D22295
  llvm-svn: 287178
* [AVR] Wrap all methods in the pseudo expansion pass in an anon namespace | Dylan McKay | 2016-11-16 | 1 | -2/+2
  The '-fpermissive' compiler flag complains if the template specializations used in the class are used in a different namespace.
  llvm-svn: 287176
* [AVR] Remove unused method from AVRTargetMachine | Dylan McKay | 2016-11-16 | 1 | -3/+0
  llvm-svn: 287173
* [x86] allow FP-logic ops when one operand is FP and result is FP | Sanjay Patel | 2016-11-16 | 1 | -14/+26
  We save an inter-register file move this way. If there's any CPU where the FP logic is slower, we could transform this back to int-logic in MachineCombiner.
  This helps, but doesn't solve, PR6137:
  https://llvm.org/bugs/show_bug.cgi?id=6137
  The 'andn' test shows that we're missing a pattern match to recognize the xor with -1 constant as a 'not' op.
  llvm-svn: 287171
* [AsmParser] Avoid recursing when lexing ';'. NFC. | Ahmed Bougacha | 2016-11-16 | 1 | -52/+54
  This should prevent stack overflows in non-optimized builds on .ll files with lots of consecutive commented-out lines.
  Instead of recursing into LexToken(), continue into a 'while (true)'.
  llvm-svn: 287170
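The loop-instead-of-recursion idea can be sketched as below. This is a minimal illustration, not the LLVM lexer: `firstTokenChar` and its whitespace handling are hypothetical, and only the `;` line-comment syntax comes from the message. Each comment line is consumed by another loop iteration, so stack depth stays constant however many commented-out lines appear in a row.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical toy lexer step: returns the first character at or after Pos
// that is neither whitespace nor part of a ';' line comment, or '\0' when the
// input is exhausted.
char firstTokenChar(const std::string &Src, std::size_t Pos = 0) {
  assert(Pos <= Src.size() && "start position must lie within the input");
  while (true) {
    if (Pos >= Src.size())
      return '\0';
    char C = Src[Pos];
    if (std::isspace(static_cast<unsigned char>(C))) {
      ++Pos;
      continue;
    }
    if (C == ';') {
      // Skip to the end of the line, then go around the loop again rather
      // than calling firstTokenChar() recursively.
      while (Pos < Src.size() && Src[Pos] != '\n')
        ++Pos;
      continue;
    }
    return C;
  }
}
```

A recursive version would add one stack frame per comment line, which is where unoptimized builds (no tail-call elimination) can run out of stack.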
* [CodeGen] Pass references, not pointers, to MMI helpers. NFC. | Ahmed Bougacha | 2016-11-16 | 4 | -13/+13
  While there, rename them to follow the coding style.
  llvm-svn: 287169
* Revert "Get GlobalISel to build on Linux after r286407" | Ahmed Bougacha | 2016-11-16 | 1 | -1/+1
  This reverts commit r286962.
  We want to avoid depending on SelectionDAG, and AddLandingPadInfo lives in CodeGen now.
  llvm-svn: 287168
* [CodeGen] Pull MMI helpers from FunctionLoweringInfo to MMI. NFC. | Ahmed Bougacha | 2016-11-16 | 3 | -57/+51
  They're not SelectionDAG- or FunctionLoweringInfo-specific. They are, however, specific to building MMI from IR. We could make them members, but it's nice having MMI be a "simple" data structure and this logic kept separate.
  This also lets us reuse them from GlobalISel.
  llvm-svn: 287167
* [CodeGen] Cleanup MachineModuleInfo doxygen comments. NFC. | Ahmed Bougacha | 2016-11-16 | 1 | -39/+7
  Remove redundant names and only keep header comments.
  llvm-svn: 287166
* [AVR] Add the pseudo instruction expansion pass | Dylan McKay | 2016-11-16 | 3 | -1/+1433
  Summary:
  A lot of the pseudo instructions are required because LLVM assumes that all integers of the same size as the pointer size are legal. This means that it will not currently expand 16-bit instructions to their 8-bit variants because it thinks 16-bit types are legal for the operations.
  This also adds all of the CodeGen tests that required the pass to run.
  Reviewers: arsenm, kparzysz
  Subscribers: wdng, mgorny, modocache, llvm-commits
  Differential Revision: https://reviews.llvm.org/D26577
  llvm-svn: 287162
* X86: Simplify X86ISD::Wrapper operand checks. NFCI. | Peter Collingbourne | 2016-11-16 | 2 | -18/+8
  We only ever create TargetConstantPool, TargetJumpTable, TargetExternalSymbol, TargetGlobalAddress, TargetGlobalTLSAddress, MCSymbol and TargetBlockAddress nodes as operands of X86ISD::Wrapper nodes, so we can remove one check and invert the other. Also update the documentation comment for X86ISD::Wrapper.
  Differential Revision: https://reviews.llvm.org/D26731
  llvm-svn: 287160
* [ImplicitNullChecks] Do not handle call MachineInstrs | Sanjoy Das | 2016-11-16 | 1 | -1/+4
  We don't track callee clobbered registers correctly, so avoid hoisting across calls.
  Note: for this bug to trigger we need a `readonly` call target, since we already have logic to not hoist across potentially storing instructions.
  llvm-svn: 287159
* Bitcode: Introduce initial multi-module reader API. | Peter Collingbourne | 2016-11-16 | 1 | -49/+99
  Implement getLazyBitcodeModule() and parseBitcodeFile() in terms of it.
  Differential Revision: https://reviews.llvm.org/D26719
  llvm-svn: 287156
* ARM: fix CodeGen for 64-bit shifts. | Tim Northover | 2016-11-16 | 1 | -17/+31
  One half of the shifts obviously needed conditional selection based on whether the shift amount is more than 32 bits, but leaving the other half as the natural shift isn't acceptable either: it's undefined behaviour to shift a 32-bit value by more than 31.
  llvm-svn: 287149
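The constraint described above can be illustrated at the C++ level. This is a sketch of a 64-bit left shift built from two 32-bit halves, not the ARM lowering from the commit: `Halves` and `shl64` are hypothetical names. Every 32-bit shift in it uses an amount in the range 0 to 31, so both halves need a case split, not only the one that receives bits carried across.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical representation of a 64-bit value as two 32-bit halves.
struct Halves {
  uint32_t Lo;
  uint32_t Hi;
};

// Left-shift a 64-bit value held as two halves. No 32-bit operand is ever
// shifted by 32 or more, which would be undefined behaviour.
Halves shl64(Halves V, unsigned Amt) {
  assert(Amt < 64 && "shift amount must be in [0, 63]");
  if (Amt == 0)
    return V; // Avoids the V.Lo >> 32 that the general case would compute.
  if (Amt >= 32)
    return {0u, V.Lo << (Amt - 32)}; // Low half becomes zero, not Lo << Amt.
  return {V.Lo << Amt, (V.Hi << Amt) | (V.Lo >> (32 - Amt))};
}
```

For example, shifting `{Lo = 1, Hi = 0}` by 40 yields `{Lo = 0, Hi = 256}`; computing the low half as the "natural" `Lo << 40` would be the undefined case the message warns about.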
* Make block placement deterministic | Rong Xu | 2016-11-16 | 1 | -3/+3
  We fail to produce bit-for-bit matching stage2 and stage3 compilers in a PGO bootstrap build. The reason is that LoopBlockSet is a SmallPtrSet, whose iteration order depends on the pointer values. This patch fixes the issue by switching to SmallSetVector.
  Differential Revision: http://reviews.llvm.org/D26634
  llvm-svn: 287148
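The difference between the two container kinds can be sketched with the standard library. `InsertionOrderedSet` below is a hypothetical stand-in for the idea behind a set-vector (a set for membership plus a vector for order); it is not `llvm::SmallSetVector` itself. Iteration follows insertion order, so it does not depend on the values stored, which matters when the elements are pointers whose values change from run to run.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical insertion-ordered set: membership is answered by a hash set,
// iteration order by a vector that records first insertions.
template <typename T> class InsertionOrderedSet {
  std::unordered_set<T> Seen;
  std::vector<T> Order;

public:
  // Returns true if V was newly inserted.
  bool insert(const T &V) {
    if (!Seen.insert(V).second)
      return false;
    Order.push_back(V);
    assert(Seen.size() == Order.size() && "set and vector must stay in sync");
    return true;
  }
  bool contains(const T &V) const { return Seen.count(V) != 0; }
  std::size_t size() const { return Order.size(); }
  typename std::vector<T>::const_iterator begin() const { return Order.begin(); }
  typename std::vector<T>::const_iterator end() const { return Order.end(); }
};

// Usage example: remove duplicates while keeping first-seen order.
std::vector<int> dedupInOrder(const std::vector<int> &In) {
  InsertionOrderedSet<int> S;
  for (int V : In)
    S.insert(V);
  return std::vector<int>(S.begin(), S.end());
}
```

The cost is storing each element twice; in exchange, any pass that walks the set visits elements in the same order on every run.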
* [InstCombine] replace unreachable with assert and remove unreachable code; NFCI | Sanjay Patel | 2016-11-16 | 1 | -20/+9
  llvm-svn: 287147
* AMDGPU: Enable ConstrainCopy DAG mutation | Matt Arsenault | 2016-11-16 | 1 | -0/+3
  This fixes a probably unintended divergence from the default scheduler behavior.
  llvm-svn: 287146
* [InstCombine] fix formatting and add FIXMEs to foldOperationIntoSelectOperand(); NFC | Sanjay Patel | 2016-11-16 | 1 | -11/+15
  llvm-svn: 287145
* [AArch64] Handle vector types in replaceZeroVectorStore. | Geoff Berry | 2016-11-16 | 1 | -20/+22
  Summary:
  Extend replaceZeroVectorStore to handle more vector type stores, floating point zero vectors and set alignment more accurately on split stores.
  This is a follow-up change to r286875.
  This change fixes PR31038.
  Reviewers: MatzeB
  Subscribers: mcrosier, aemerson, llvm-commits, rengolin
  Differential Revision: https://reviews.llvm.org/D26682
  llvm-svn: 287142
* [LoopVectorize] Fix for non-determinism in codegen | Mandeep Singh Grang | 2016-11-16 | 1 | -1/+1
  Summary:
  This patch fixes issues in codegen uncovered due to https://reviews.llvm.org/D26718
  Reviewers: mssimpso
  Subscribers: llvm-commits, mzolotukhin
  Differential Revision: https://reviews.llvm.org/D26727
  llvm-svn: 287135
* AMDGPU/SI: Avoid creating unnecessary copies in the SIFixSGPRCopies pass | Tom Stellard | 2016-11-16 | 4 | -26/+78
  Summary:
  1. Don't try to copy values to and from the same register class.
  2. Replace copies of registers that hold immediate values with v_mov/s_mov instructions.
  The main purpose of this change is to make MachineSink do a better job of determining when it is beneficial to split a critical edge, since the pass assumes that copies will become move instructions.
  This prevents a regression in uniform-cfg.ll if we enable critical edge splitting for AMDGPU.
  Reviewers: arsenm
  Subscribers: arsenm, kzhuravl, llvm-commits
  Differential Revision: https://reviews.llvm.org/D23408
  llvm-svn: 287131
* fix comment formatting; NFC | Sanjay Patel | 2016-11-16 | 1 | -8/+4
  llvm-svn: 287127
* [x86] add fake scalar FP logic instructions to ReplaceableInstrs to save some bytes | Sanjay Patel | 2016-11-16 | 1 | -0/+8
  We can replace "scalar" FP-bitwise-logic with other forms of bitwise-logic instructions. Scalar SSE/AVX FP-logic instructions only exist in your imagination and/or the bowels of compilers, but logically equivalent int, float, and double variants of bitwise-logic instructions are reality in x86, and the float variant may be a shorter instruction depending on which flavor (SSE or AVX) of vector ISA you have...so just prefer float all the time.
  This is a preliminary step towards solving PR6137:
  https://llvm.org/bugs/show_bug.cgi?id=6137
  Differential Revision: https://reviews.llvm.org/D26712
  llvm-svn: 287122
* [sancov] Name the global containing the main source file name | Reid Kleckner | 2016-11-16 | 1 | -3/+3
  If the global name doesn't start with __sancov_gen, ASan will insert unnecessary red zones around it.
  llvm-svn: 287117
* test commit, changed tab to spaces, NFC | Daniil Fukalov | 2016-11-16 | 1 | -1/+1
  llvm-svn: 287116
* Add a little endian variant of TCE. | Pekka Jaaskelainen | 2016-11-16 | 1 | -1/+10
  llvm-svn: 287111
* [X86][AVX512] Autoupgrade lossless i32/u32 to f64 conversion intrinsics with generic IR | Simon Pilgrim | 2016-11-16 | 3 | -21/+29
  Both the (V)CVTDQ2PD (i32 to f64) and (V)CVTUDQ2PD (u32 to f64) conversion instructions are lossless and can be safely represented as generic SINT_TO_FP/UINT_TO_FP calls instead of x86 intrinsics without affecting final codegen.
  LLVM counterpart to D26686
  Differential Revision: https://reviews.llvm.org/D26736
  llvm-svn: 287108
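Why these conversions are lossless can be checked directly in C++. The sketch below is illustrative only (the function names are hypothetical and it says nothing about the intrinsics themselves): an IEEE double carries a 53-bit significand, so every 32-bit signed or unsigned integer converts exactly and converts back to the same value.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// A double must have at least 32 significand bits for the claim to hold;
// IEEE binary64 has 53.
static_assert(std::numeric_limits<double>::digits >= 32,
              "double must represent every 32-bit integer exactly");

// Hypothetical helpers mirroring the signed and unsigned conversions.
double i32ToF64(int32_t V) {
  double D = static_cast<double>(V);
  assert(static_cast<int32_t>(D) == V && "conversion must round-trip");
  return D;
}

double u32ToF64(uint32_t V) {
  double D = static_cast<double>(V);
  assert(static_cast<uint32_t>(D) == V && "conversion must round-trip");
  return D;
}
```

No rounding ever takes place in either direction, which is consistent with the message's point that a generic int-to-FP operation is a safe replacement.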
* [mips] Fix unsigned/signed type error | Simon Dardis | 2016-11-16 | 1 | -3/+3
  MipsFastISel uses a class to represent addresses with a signed member to represent the offset. MipsFastISel::emitStore, emitLoad and computeAddress all treated the offset as being positive. In cases where the offset was actually negative and a frame pointer was used, this would cause the constant synthesis routine to crash as it would generate an unexpected instruction sequence when frame indexes are replaced.
  Reviewers: vkalintiris
  Differential Revision: https://reviews.llvm.org/D26192
  llvm-svn: 287099
* [mips] not instruction alias | Simon Dardis | 2016-11-16 | 2 | -0/+5
  This patch adds the single operand form of the not alias to microMIPS and MIPS along with additional tests.
  This partially resolves PR/30381.
  Thanks to Sean Bruno for reporting the issue!
  llvm-svn: 287097
* Remove TimeValue class | Pavel Labath | 2016-11-16 | 2 | -56/+0
  Summary:
  All uses have been replaced by appropriate std::chrono types, and the class is now unused.
  Reviewers: zturner, mehdi_amini
  Subscribers: llvm-commits, mgorny
  Differential Revision: https://reviews.llvm.org/D26447
  llvm-svn: 287094
* [X86][AVX512] Removing llvm x86 intrinsics for _mm_mask_move_{ss|sd} intrinsics. | Ayman Musa | 2016-11-16 | 2 | -4/+16
  Differential Revision: https://reviews.llvm.org/D26128
  llvm-svn: 287087
* [X86] Remove the scalar intrinsics for fadd/fsub/fdiv/fmul | Craig Topper | 2016-11-16 | 4 | -97/+75
  Summary:
  These intrinsics have been unused by clang for a while. This patch removes them. We auto upgrade them to extractelements, a scalar operation and then an insertelement. This matches the sequence used by clang's intrinsic file.
  Reviewers: zvi, delena, RKSimon
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D26660
  llvm-svn: 287083
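The extract, operate, insert sequence can be modelled on a plain four-float array. This is a sketch of the data flow only, assuming the usual scalar SSE behaviour in which lane 0 is computed and the remaining lanes come from the first operand; `V4F`, `ScalarOp`, and `lowerScalarOp` are hypothetical names, not LLVM APIs.

```cpp
#include <array>
#include <cassert>

using V4F = std::array<float, 4>;

enum class ScalarOp { Add, Sub, Mul, Div };

// Apply the scalar operation to two already-extracted elements.
float applyScalar(ScalarOp Op, float L, float R) {
  switch (Op) {
  case ScalarOp::Add: return L + R;
  case ScalarOp::Sub: return L - R;
  case ScalarOp::Mul: return L * R;
  case ScalarOp::Div: return L / R;
  }
  assert(false && "unknown ScalarOp");
  return 0.0f;
}

// Model of the upgraded sequence: extract lane 0 of both operands, perform
// the scalar operation, then insert the result back into lane 0 of A.
V4F lowerScalarOp(ScalarOp Op, V4F A, const V4F &B) {
  float Lo = applyScalar(Op, A[0], B[0]); // two extracts + scalar op
  A[0] = Lo;                              // insert into the first operand
  return A;                               // lanes 1..3 unchanged from A
}
```

Expressed this way the operation needs no target-specific intrinsic: it is three generic steps that an optimizer can reason about individually.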
* [AMDGPU] Refactor v_mac_{f16, f32} patterns into a class NFC | Konstantin Zhuravlyov | 2016-11-16 | 1 | -23/+18
  Differential Revision: https://reviews.llvm.org/D26711
  llvm-svn: 287077
* AArch64: Use DeadRegisterDefinitionsPass before regalloc. | Matthias Braun | 2016-11-16 | 2 | -33/+26
  Doing this before register allocation reduces register pressure as we do not even have to allocate a register for those dead definitions.
  Differential Revision: https://reviews.llvm.org/D26111
  llvm-svn: 287076
* [AMDGPU] Handle f16 select{_cc} | Konstantin Zhuravlyov | 2016-11-16 | 3 | -15/+13
  - Select `select` to `v_cndmask_b32`
  - Expand `select_cc`
  - Refactor patterns
  Differential Revision: https://reviews.llvm.org/D26714
  llvm-svn: 287074
* [RegAllocGreedy] Record missed hint for late recoloring. | Quentin Colombet | 2016-11-16 | 1 | -0/+3
  In https://reviews.llvm.org/D25347, Geoff noticed that we still have useless copies that we can eliminate after register allocation. At the time the allocation is chosen for those copies, they are not useless but, because of changes in the surrounding code, later on they might become useless.
  The Greedy allocator already has a mechanism to deal with such cases with a late recoloring. However, we failed to record some of the missed hints.
  This commit fixes that.
  llvm-svn: 287070
* Align Modi and FileInfo substreams on 32-byte offsets. | Rui Ueyama | 2016-11-16 | 1 | -4/+4
  This is required by DbiStream, but DbiStreamBuilder didn't align these substreams, so the output of DbiStreamBuilder couldn't be read by DbiStream.
  Test will be added to LLD.
  llvm-svn: 287067
* Fixed the lost FastMathFlags for CALL operations in SLPVectorizer. | Vyacheslav Klochkov | 2016-11-16 | 1 | -0/+1
  Reviewer: Michael Zolotukhin.
  Differential Revision: https://reviews.llvm.org/D26575
  llvm-svn: 287064
* [BypassSlowDivision] Handle division by constant numerators better. | Justin Lebar | 2016-11-16 | 1 | -1/+16
  Summary:
  We don't do BypassSlowDivision when the denominator is a constant, but we do do it when the numerator is a constant.
  This patch makes two related changes to BypassSlowDivision when the numerator is a constant:
  * If the numerator is too large to fit into the bypass width, don't bypass slow division (because we'll never run the smaller-width code).
  * If we bypass slow division where the numerator is a constant, don't OR together the numerator and denominator when determining whether both operands fit within the bypass width. We need to check only the denominator.
  Reviewers: tra
  Subscribers: llvm-commits, jholewinski
  Differential Revision: https://reviews.llvm.org/D26699
  llvm-svn: 287062
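The checks described above can be sketched in ordinary C++. The code assumes a 64-bit division with a 32-bit bypass width, which the message does not state, and the names (`udivWithBypass`, `ConstNumPlan`, `planForConstantNumerator`) are hypothetical. The first function shows the general runtime test, where ORing the operands checks both high halves at once; the second shows the compile-time decision for a constant numerator.

```cpp
#include <cassert>
#include <cstdint>

// General case: if neither operand has bits above the bypass width, the
// cheaper 32-bit division gives the same quotient.
uint64_t udivWithBypass(uint64_t Num, uint64_t Den) {
  assert(Den != 0 && "division by zero");
  if (((Num | Den) >> 32) == 0)
    return static_cast<uint32_t>(Num) / static_cast<uint32_t>(Den);
  return Num / Den; // slow path
}

// Constant-numerator case: the numerator's width is known up front.
enum class ConstNumPlan {
  NoBypass,            // numerator too wide: the fast path could never run
  CheckDenominatorOnly // numerator fits: only the denominator needs a test
};

ConstNumPlan planForConstantNumerator(uint64_t ConstNum) {
  return (ConstNum >> 32) != 0 ? ConstNumPlan::NoBypass
                               : ConstNumPlan::CheckDenominatorOnly;
}
```

When the numerator is a known constant that fits, ORing it into the test contributes nothing, so the runtime check reduces to examining the denominator alone.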
* [BypassSlowDivision] Simplify partially-tautological if statement. | Justin Lebar | 2016-11-16 | 1 | -4/+3
  if (A || (B && A)) --> if (A).
  llvm-svn: 287061
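The rewrite is an instance of the absorption law, and with only two boolean inputs it can be checked exhaustively. The function names below are hypothetical and the sketch assumes A and B are side-effect-free conditions.

```cpp
#include <cassert>
#include <initializer_list>

bool originalCondition(bool A, bool B) { return A || (B && A); }
bool simplifiedCondition(bool A) { return A; }

// Check all four input combinations: if A is true both sides are true, and
// if A is false then (B && A) is false as well, so both sides are false.
void verifyEquivalence() {
  for (bool A : {false, true})
    for (bool B : {false, true})
      assert(originalCondition(A, B) == simplifiedCondition(A));
}
```

If evaluating B had side effects, the equivalence would not hold in the case where A is false, since the original form would still evaluate B.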
* Fix Modi and File count if there are more than 65535 modules/files. | Rui Ueyama | 2016-11-16 | 1 | -2/+2
  These numbers are intended to be capped at 65535, but `std::min<uint16_t>(UINT16_MAX, N)` always returns N truncated to 16 bits, because the expression is the same as `std::min((uint16_t)UINT16_MAX, (uint16_t)N)`.
  llvm-svn: 287060
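The pitfall can be reproduced in isolation. The sketch assumes the capping call is `std::min<uint16_t>` (a cap is a minimum); `capBroken` and `capFixed` are hypothetical names, and the fix shown is one reasonable option, not necessarily the one the commit applied. Forcing the template argument to `uint16_t` converts both operands before they are compared, so a large count wraps modulo 65536 and the cap never takes effect.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Broken: N is converted to uint16_t before the comparison, so 70000 becomes
// 70000 - 65536 = 4464 and is then "capped" against 65535.
uint16_t capBroken(std::size_t N) {
  return std::min<uint16_t>(UINT16_MAX, N);
}

// Fixed: compare in a type wide enough for N, then narrow the capped result.
uint16_t capFixed(std::size_t N) {
  std::size_t Capped = std::min<std::size_t>(UINT16_MAX, N);
  assert(Capped <= UINT16_MAX && "value must fit in 16 bits after capping");
  return static_cast<uint16_t>(Capped);
}
```

With these definitions `capBroken(70000)` yields 4464 while `capFixed(70000)` yields 65535; both agree for counts up to 65535.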
* Always use relative jump table encodings on PowerPC64. | Joerg Sonnenberger | 2016-11-16 | 2 | -0/+59
  For the default, small and medium code model, use the existing difference from the jump table towards the label. For all other code models, set up the picbase and use the difference between the picbase and the block address.
  Overall, this results in smaller data tables at the expense of one or two more arithmetic operations at the jump site. Given that we only create jump tables with a lot more than two entries, it is a net win in size. For larger code models the assumption remains that individual functions are no larger than 2GB.
  Differential Revision: https://reviews.llvm.org/D26336
  llvm-svn: 287059
* AMDGPU/GCN: Exit early in hazard recognizer if there is no vreg argument | Jan Vesely | 2016-11-15 | 1 | -0/+4
  wbinvl.* are vector instructions that do not use vector registers.
  v2: check only M?BUF instructions
  Differential Revision: https://reviews.llvm.org/D26633
  llvm-svn: 287056