path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* [SimplifyCFG] Be even more conservative in SinkThenElseCodeToEnd | James Molloy | 2016-09-11 | 1 | -15/+19
  This should *actually* fix PR30244. This cranks up the workaround for PR30188 so that we never sink loads or stores of allocas. The idea is that these should be removed by SROA/Mem2Reg, and any movement of them may well confuse SROA or just cause unwanted code churn. It's not ideal that the midend should be crippled like this, but that unwanted churn can really cause significant regressions in important workloads (tsan).
  llvm-svn: 281162
* [SimplifyCFG] Harden up the profitability heuristic for block splitting during sinking | James Molloy | 2016-09-11 | 1 | -5/+20
  Exposed by PR30244, we will split a block currently if we think we can sink at least one instruction. However, this isn't right - the reason we split predecessors is so that we can sink instructions that otherwise couldn't be sunk because it isn't safe to do so - stores, for example. So, change the heuristic to only split if it thinks it can sink at least one non-speculatable instruction. Should fix PR30244.
  llvm-svn: 281160
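The heuristic change described above can be sketched as a toy predicate. This is only an illustration under stated assumptions: `Candidate`, `shouldSplitOld` and `shouldSplitNew` are hypothetical names, not LLVM's classes or the code in SimplifyCFG.

```cpp
#include <algorithm>
#include <vector>

// Toy stand-in for an instruction considered for sinking.
// Hypothetical type; LLVM's real Instruction class is far richer.
struct Candidate {
  bool canSink;       // sinking into the common successor is possible
  bool speculatable;  // safe to execute unconditionally (e.g. an add)
};

// Old behaviour as described: split if any instruction can be sunk.
bool shouldSplitOld(const std::vector<Candidate> &insts) {
  return std::any_of(insts.begin(), insts.end(),
                     [](const Candidate &c) { return c.canSink; });
}

// New behaviour as described: split only if at least one sinkable
// instruction is non-speculatable (a store, for example).
bool shouldSplitNew(const std::vector<Candidate> &insts) {
  return std::any_of(insts.begin(), insts.end(), [](const Candidate &c) {
    return c.canSink && !c.speculatable;
  });
}
```

With a block that contains only sinkable, speculatable instructions, the old predicate asks for a split and the new one does not.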
* [CodeGen] Make the TwoAddressInstructionPass check if the instruction is commutable before calling findCommutedOpIndices for every operand. | Craig Topper | 2016-09-11 | 1 | -1/+4
  Also make sure the operand is a register before each call to save some work on commutable instructions that might have an operand.
  llvm-svn: 281158
* [AVX-512] Add VPTERNLOG to load folding tables. | Craig Topper | 2016-09-11 | 1 | -0/+18
  llvm-svn: 281156
* [X86] Make a helper method into a static function local to the cpp file. | Craig Topper | 2016-09-11 | 2 | -11/+10
  llvm-svn: 281154
* Add handling of !invariant.load to PropagateMetadata. | Justin Lebar | 2016-09-11 | 1 | -6/+6
  Summary: This will let e.g. the load/store vectorizer propagate this metadata appropriately.
  Reviewers: arsenm
  Subscribers: tra, jholewinski, hfinkel, mzolotukhin
  Differential Revision: https://reviews.llvm.org/D23479
  llvm-svn: 281153
* [NVPTX] Use ldg for explicitly invariant loads. | Justin Lebar | 2016-09-11 | 1 | -13/+22
  Summary: With this change (plus some changes to prevent !invariant from being clobbered within llvm), clang will be able to model the __ldg CUDA builtin as an invariant load, rather than as a target-specific llvm intrinsic. This will let the optimizer play with these loads -- specifically, we should be able to vectorize them in the load-store vectorizer.
  Reviewers: tra
  Subscribers: jholewinski, hfinkel, llvm-commits, chandlerc
  Differential Revision: https://reviews.llvm.org/D23477
  llvm-svn: 281152
* [CodeGen] Split out the notions of MI invariance and MI dereferenceability. | Justin Lebar | 2016-09-11 | 22 | -62/+101
  Summary: An IR load can be invariant, dereferenceable, neither, or both. But currently, MI's notion of invariance is IR-invariant && IR-dereferenceable. This patch splits up the notions of invariance and dereferenceability at the MI level. It's NFC, so adds some probably-unnecessary "is-dereferenceable" checks, which we can remove later if desired.
  Reviewers: chandlerc, tstellarAMD
  Subscribers: jholewinski, arsenm, nemanjai, llvm-commits
  Differential Revision: https://reviews.llvm.org/D23371
  llvm-svn: 281151
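The distinction drawn in this message can be modelled with two independent flags. The sketch below is a toy model only: `LoadFlags`, `oldMIInvariant`, `newMIInvariant` and `newMIDereferenceable` are hypothetical names chosen for illustration, not LLVM's MachineInstr or MachineMemOperand API.

```cpp
// Toy model of the two IR-level properties a load can carry.
// Hypothetical struct; not an LLVM type.
struct LoadFlags {
  bool invariant;        // the loaded memory never changes
  bool dereferenceable;  // the address is known safe to load from
};

// MI-level "invariant" before the patch, as the message describes it:
// both IR properties folded into one notion.
bool oldMIInvariant(const LoadFlags &f) {
  return f.invariant && f.dereferenceable;
}

// After the split, each notion is queried on its own.
bool newMIInvariant(const LoadFlags &f) { return f.invariant; }
bool newMIDereferenceable(const LoadFlags &f) { return f.dereferenceable; }
```

A load that is invariant but not dereferenceable is the case the old combined notion could not express: `oldMIInvariant` rejects it, while `newMIInvariant` accepts it. Code that relied on the old meaning has to check both new predicates, which is where the extra "is-dereferenceable" checks come from.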
* It should also be legal to pass a swifterror parameter to a call as a swifterror argument. | Arnold Schwaighofer | 2016-09-10 | 1 | -4/+9
  rdar://28233388
  llvm-svn: 281147
* InstCombine: Don't combine loads/stores from swifterror to a new type | Arnold Schwaighofer | 2016-09-10 | 1 | -0/+8
  This generates invalid IR: the only users of swifterror can be call arguments, loads, and stores.
  rdar://28242257
  llvm-svn: 281144
* Add an isSwiftError predicate to Value | Arnold Schwaighofer | 2016-09-10 | 1 | -0/+10
  llvm-svn: 281143
* [InstCombine] clean up foldICmpBinOpEqualityWithConstant / foldICmpIntrinsicWithConstant ; NFC | Sanjay Patel | 2016-09-10 | 1 | -59/+56
  1. Rename variables to be consistent with related/preceding code (may want to reorganize).
  2. Fix comments/formatting.
  llvm-svn: 281140
* [InstCombine] rename and reorganize some icmp folding functions; NFC | Sanjay Patel | 2016-09-10 | 2 | -24/+23
  Everything under foldICmpInstWithConstant() should now be working for splat vectors via m_APInt matchers. I.e., I've removed all of the FIXMEs that I added while cleaning that section up. Note that not all of the associated FIXMEs in the regression tests are gone though, because some of the tests require earlier folds that are still scalar-only.
  llvm-svn: 281139
* We also need to pass swifterror in R12 under swiftcc not only under ccc | Arnold Schwaighofer | 2016-09-10 | 1 | -0/+3
  rdar://28190687
  llvm-svn: 281138
* [AMDGPU] Refactor MUBUF/MTBUF instructions | Valery Pykhtin | 2016-09-10 | 6 | -1168/+1306
  Differential revision: https://reviews.llvm.org/D24295
  llvm-svn: 281137
* [WebAssembly] Fix typos in comments | Heejin Ahn | 2016-09-10 | 1 | -11/+14
  llvm-svn: 281131
* [libFuzzer] print a failed-merge warning only in the merge mode | Kostya Serebryany | 2016-09-10 | 1 | -0/+1
  llvm-svn: 281130
* AMDGPU: Implement is{LoadFrom|StoreTo}FrameIndex | Matt Arsenault | 2016-09-10 | 6 | -21/+90
  llvm-svn: 281128
* AMDGPU: Fix scheduling info for spill pseudos | Matt Arsenault | 2016-09-10 | 1 | -2/+3
  These defaulted to Write32Bit. I don't think this actually matters since these don't exist during scheduling.
  llvm-svn: 281127
* [asan] Add flag to allow lifetime analysis of problematic allocas | Vitaly Buka | 2016-09-10 | 1 | -0/+6
  Summary: Could be useful for comparison when we suspect that alloca was skipped because of this.
  Reviewers: eugenis
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D24437
  llvm-svn: 281126
* [CodeGen] Rename MachineInstr::isInvariantLoad to isDereferenceableInvariantLoad. NFC | Justin Lebar | 2016-09-10 | 8 | -16/+16
  Summary: I want to separate out the notions of invariance and dereferenceability at the MI level, so that they correspond to the equivalent concepts at the IR level. (Currently an MI load is MI-invariant iff it's IR-invariant and IR-dereferenceable.) First step is renaming this function.
  Reviewers: chandlerc
  Subscribers: MatzeB, jfb, llvm-commits
  Differential Revision: https://reviews.llvm.org/D23370
  llvm-svn: 281125
* [libFuzzer] don't print help for internal flags | Kostya Serebryany | 2016-09-10 | 2 | -0/+3
  llvm-svn: 281124
* [libFuzzer] print a visible message if merge fails due to a crash | Kostya Serebryany | 2016-09-10 | 3 | -0/+24
  llvm-svn: 281122
* AMDGPU: Fix immediate folding logic when shrinking instructions | Matt Arsenault | 2016-09-09 | 3 | -16/+10
  If the literal is being folded into src0, it doesn't matter if it's an SGPR because it's being replaced with the literal. Also fixes initially selecting 32-bit versions of some instructions which also confused commuting.
  llvm-svn: 281117
* Inliner: Don't mark swifterror allocas with lifetime markers | Arnold Schwaighofer | 2016-09-09 | 1 | -0/+3
  This would create a bitcast use which fails the verifier: swifterror values may only be used by loads, stores, and as function arguments.
  rdar://28233244
  llvm-svn: 281114
* X86: Fold tail calls into conditional branches also for 64-bit (PR26302) | Hans Wennborg | 2016-09-09 | 4 | -12/+40
  This extends the optimization in r280832 to also work for 64-bit. The only quirk is that we can't do this for 64-bit Windows (yet).
  Differential Revision: https://reviews.llvm.org/D24423
  llvm-svn: 281113
* AMDGPU: Run LoadStoreVectorizer pass by default | Matt Arsenault | 2016-09-09 | 2 | -1/+4
  llvm-svn: 281112
* [libFuzzer] use sizeof() in tests instead of 4 and 8 | Kostya Serebryany | 2016-09-09 | 2 | -6/+6
  llvm-svn: 281111
* LSV: Fix incorrectly increasing alignment | Matt Arsenault | 2016-09-09 | 1 | -18/+16
  If the unaligned access has a dynamic offset, the offset may be odd, which would make the adjusted alignment incorrect to use.
  llvm-svn: 281110
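The arithmetic behind this fix can be illustrated with a small sketch. It shows only the alignment reasoning, not the vectorizer's actual code; `minAlign` and `knownAlign` are hypothetical helpers written for this example.

```cpp
#include <cstdint>

// Largest power of two that divides both a and b.
// Assumes at least one of a, b is nonzero.
std::uint64_t minAlign(std::uint64_t a, std::uint64_t b) {
  std::uint64_t x = a | b;
  return x & (~x + 1);  // isolate the lowest set bit
}

// Alignment that can be assumed for (base + offset) when the base is
// aligned to baseAlign. With a constant offset the answer follows from
// the offset's low bits; with a dynamic offset nothing is known about
// those bits, so only byte alignment is safe to assume.
std::uint64_t knownAlign(std::uint64_t baseAlign, bool offsetIsConstant,
                         std::uint64_t offset) {
  return offsetIsConstant ? minAlign(baseAlign, offset) : 1;
}
```

Raising the base alignment to 4 helps an access at constant offset 8, but says nothing about an access whose offset is only known at run time, since that offset could be 3.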
* [InstCombine] use m_APInt to allow icmp ult X, C folds for splat constant vectors | Sanjay Patel | 2016-09-09 | 1 | -8/+13
  llvm-svn: 281107
* [libFuzzer] one more puzzle for value profile | Kostya Serebryany | 2016-09-09 | 3 | -0/+25
  llvm-svn: 281106
* [X86][XOP] Fix VPERMIL2PD mask creation on 32-bit targets | Simon Pilgrim | 2016-09-09 | 1 | -5/+5
  Use getConstVector helper to correctly create v2i64/v4i64 constants on 32-bit targets
  llvm-svn: 281105
* [Hexagon] Fix disassembler crash after r279255 | Krzysztof Parzyszek | 2016-09-09 | 1 | -0/+3
  When p0 was added as an explicit operand to the duplex subinstructions, the disassembler was not updated to reflect this.
  llvm-svn: 281104
* Create phi nodes for swifterror values at the end of the phi instructions list | Arnold Schwaighofer | 2016-09-09 | 1 | -1/+1
  ISel makes assumptions about the order of phi nodes.
  rdar://28190150
  llvm-svn: 281095
* [NVPTX] Implement llvm.fabs.f32, llvm.max.f32, etc. | Justin Lebar | 2016-09-09 | 2 | -16/+132
  Summary: Previously these only worked via NVPTX-specific intrinsics. This change will allow us to convert these target-specific intrinsics into the general LLVM versions, allowing existing LLVM passes to reason about their behavior. It also gets us some minor codegen improvements as-is, from situations where we canonicalize code into one of these llvm intrinsics.
  Reviewers: majnemer
  Subscribers: llvm-commits, jholewinski, tra
  Differential Revision: https://reviews.llvm.org/D24300
  llvm-svn: 281092
* ARM: move the builtins libcall CC setup | Saleem Abdulrasool | 2016-09-09 | 3 | -166/+171
  Move the target specific setup into the target specific lowering setup. As pointed out by Anton, the initial change was moving this too high up the stack resulting in a violation of the layering (the target generic code path setup target specific bits). Sink this into the ARM specific setup. NFC.
  llvm-svn: 281088
* Add a lower level zlib::uncompress. | Rafael Espindola | 2016-09-09 | 1 | -6/+13
  SmallVectors are convenient, but they don't cover every use case. In particular, they are fairly large (3 pointers + one element) and there is no way to take ownership of the buffer to put it somewhere else. This patch then adds a lower-level interface that works with any buffer.
  llvm-svn: 281082
* AMDGPU : Fix mqsad_u32_u8 instruction incorrect data type. | Wei Ding | 2016-09-09 | 3 | -9/+17
  Differential Revision: http://reviews.llvm.org/D23700
  llvm-svn: 281081
* AMDGPU/SI: Make sure llvm.amdgcn.implicitarg.ptr() is 8-byte aligned for HSA | Tom Stellard | 2016-09-09 | 2 | -1/+6
  Reviewers: arsenm
  Subscribers: arsenm, wdng, nhaehnle, llvm-commits
  Differential Revision: https://reviews.llvm.org/D24405
  llvm-svn: 281080
* [pdb] Print out some more info when dumping a raw stream. | Zachary Turner | 2016-09-09 | 1 | -0/+4
  We have various command line options that print the type of a stream, the size of a stream, etc but nowhere that it can all be viewed together.
  Since a previous patch introduced the ability to dump the bytes of a stream, this seems like a good place to present a full view of the stream's properties including its size, what kind of data it represents, and the blocks it occupies. So I added the ability to print that information to the -stream-data command line option.
  llvm-svn: 281077
* Do not widen load for different variable in GVN. | Dehao Chen | 2016-09-09 | 1 | -37/+1
  Summary: Widening load in GVN is too early because it will block other optimizations like PRE, LICM.
  https://llvm.org/bugs/show_bug.cgi?id=29110
  The SPECCPU2006 benchmark impact of this patch:

  Reference: o2_nopatch
  (1): o2_patched

  Benchmark                          Base:Reference   (1)
  -------------------------------------------------------
  spec/2006/fp/C++/444.namd          25.2             -0.08%
  spec/2006/fp/C++/447.dealII        45.92            +1.05%
  spec/2006/fp/C++/450.soplex        41.7             -0.26%
  spec/2006/fp/C++/453.povray        35.65            +1.68%
  spec/2006/fp/C/433.milc            23.79            +0.42%
  spec/2006/fp/C/470.lbm             41.88            -1.12%
  spec/2006/fp/C/482.sphinx3         47.94            +1.67%
  spec/2006/int/C++/471.omnetpp      22.46            -0.36%
  spec/2006/int/C++/473.astar        21.19            +0.24%
  spec/2006/int/C++/483.xalancbmk    36.09            -0.11%
  spec/2006/int/C/400.perlbench      33.28            +1.35%
  spec/2006/int/C/401.bzip2          22.76            -0.04%
  spec/2006/int/C/403.gcc            32.36            +0.12%
  spec/2006/int/C/429.mcf            41.04            -0.41%
  spec/2006/int/C/445.gobmk          26.94            +0.04%
  spec/2006/int/C/456.hmmer          24.5             -0.20%
  spec/2006/int/C/458.sjeng          28               -0.46%
  spec/2006/int/C/462.libquantum     55.25            +0.27%
  spec/2006/int/C/464.h264ref        45.87            +0.72%
  geometric mean                                      +0.23%

  For most benchmarks, it's a wash, but we do see stable improvements on some benchmarks, e.g. 447,453,482,400.
  Reviewers: davidxl, hfinkel, dberlin, sanjoy, reames
  Subscribers: gberry, junbuml
  Differential Revision: https://reviews.llvm.org/D24096
  llvm-svn: 281074
* Fix another -Wunused-variable for non-assert build. | Rui Ueyama | 2016-09-09 | 1 | -3/+4
  llvm-svn: 281073
* Fix -Wunused-variable for non-assert build. | Rui Ueyama | 2016-09-09 | 1 | -3/+2
  llvm-svn: 281069
* [pdb] Pass CVRecord's through the visitor as non-const references. | Zachary Turner | 2016-09-09 | 5 | -85/+85
  This simplifies a lot of code, and will actually be necessary for an upcoming patch to serialize TPI record hash values.
  The idea before was that visitors should be examining records, not modifying them. But this is no longer true with a visitor that constructs a CVRecord from Yaml. To handle this until now, we were doing some fixups on CVRecord objects at a higher level, but the code is really awkward, and it makes sense to just have the visitor write the bytes into the CVRecord.
  In doing so I uncovered a few bugs related to `Data` and `RawData` and fixed those.
  Reviewed By: rnk
  Differential Revision: https://reviews.llvm.org/D24362
  llvm-svn: 281067
* [libFuzzer] one more puzzle, value_profile cracks it in a second | Kostya Serebryany | 2016-09-09 | 3 | -0/+25
  llvm-svn: 281066
* [pdb] Write PDB TPI Stream from Yaml. | Zachary Turner | 2016-09-09 | 9 | -74/+177
  This writes the full sequence of type records described in Yaml to the TPI stream of the PDB file.
  Reviewed By: rnk
  Differential Revision: https://reviews.llvm.org/D24316
  llvm-svn: 281063
* [codeview] Don't assert if the array element type is incomplete | Reid Kleckner | 2016-09-09 | 1 | -15/+26
  This can happen when the frontend knows the debug info will be emitted somewhere else. Usually this happens for dynamic classes with out of line constructors or key functions, but it can also happen when modules are enabled.
  llvm-svn: 281060
* [AMDGPU] Assembler: better support for immediate literals in assembler. | Sam Kolton | 2016-09-09 | 14 | -351/+708
  Summary: Previously the assembler parsed all literals as either 32-bit integers or 32-bit floating-point values. Because of this we couldn't support f64 literals. E.g. in instruction "v_fract_f64 v[0:1], 0.5", literal 0.5 was encoded as 32-bit literal 0x3f000000, which is incorrect and will be interpreted as 3.0517578125E-5 instead of 0.5. Correct encoding is inline constant 240 (optimal) or 32-bit literal 0x3FE00000 at least.

  With this change the way immediate literals are parsed is changed. All literals are always parsed as 64-bit values, either integer or floating-point. Then we convert parsed literals to the correct form based on information about the type of operand parsed (was the literal floating or binary) and the type of expected instruction operands (is this an f32/64 or b32/64 instruction).

  Here are the rules for converting literals:
  - We parsed fp literal:
    - Instruction expects 64-bit operand:
      - If parsed literal is inlinable (e.g. v_fract_f64_e32 v[0:1], 0.5)
        - then we do nothing with this literal
      - Else if literal is not-inlinable but instruction requires to inline it (e.g. this is e64 encoding, v_fract_f64_e64 v[0:1], 1.5)
        - report error
      - Else literal is not-inlinable but we can encode it as additional 32-bit literal constant
        - If instruction expects fp operand type (f64)
          - Check if low 32 bits of literal are zeroes (e.g. v_fract_f64 v[0:1], 1.5)
            - If so then do nothing
          - Else (e.g. v_fract_f64 v[0:1], 3.1415)
            - report warning that low 32 bits will be set to zeroes and precision will be lost
            - set low 32 bits of literal to zeroes
        - Instruction expects integer operand type (e.g. s_mov_b64_e32 s[0:1], 1.5)
          - report error as it is unclear how to encode this literal
    - Instruction expects 32-bit operand:
      - Convert parsed 64 bit fp literal to 32 bit fp. Allow loss of precision but not overflow or underflow
      - Is this literal inlinable and are we required to inline literal (e.g. v_trunc_f32_e64 v0, 0.5)
        - do nothing
        - Else report error
      - Do nothing. We can encode any other 32-bit fp literal (e.g. v_trunc_f32 v0, 10000000.0)
  - Parsed binary literal:
    - Is this literal inlinable (e.g. v_trunc_f32_e32 v0, 35)
      - do nothing
    - Else, are we required to inline this literal (e.g. v_trunc_f32_e64 v0, 35)
      - report error
    - Else, literal is not-inlinable and we are not required to inline it
      - Are high 32 bit of literal zeroes or same as sign bit (32 bit)
        - do nothing (e.g. v_trunc_f32 v0, 0xdeadbeef)
      - Else
        - report error (e.g. v_trunc_f32 v0, 0x123456789abcdef0)

  For this change it is required that we know operand types of instruction (are they f32/64 or b32/64). I added several new register operands (they extend previous register operands) and set operand types to corresponding types:
  '''
  enum OperandType {
    OPERAND_REG_IMM32_INT,
    OPERAND_REG_IMM32_FP,
    OPERAND_REG_INLINE_C_INT,
    OPERAND_REG_INLINE_C_FP,
  }
  '''

  This is not working yet:
  - Several tests are failing
  - Problems with predicate methods for inline immediates
  - LLVM generated assembler parts try to select e64 encoding before e32.
  More changes are required for several AsmOperands.
  Reviewers: vpykhtin, tstellarAMD
  Subscribers: arsenm, kzhuravl, artem.tamazov
  Differential Revision: https://reviews.llvm.org/D22922
  llvm-svn: 281050
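The numbers quoted in this message can be reproduced with a few bit-level helpers. This is a sketch of the arithmetic only, assuming IEEE-754 `float` (32-bit) and `double` (64-bit) and assuming that a 32-bit literal supplies the high half of an f64 operand with the low half zero, which is what the quoted values imply. All function names are hypothetical and are not part of the AMDGPU assembler.

```cpp
#include <cstdint>
#include <cstring>

// Raw bit pattern of a single-precision value.
std::uint32_t bitsOfFloat(float f) {
  std::uint32_t b;
  std::memcpy(&b, &f, sizeof b);
  return b;
}

// Raw bit pattern of a double-precision value.
std::uint64_t bitsOfDouble(double d) {
  std::uint64_t b;
  std::memcpy(&b, &d, sizeof b);
  return b;
}

// Value of a double whose high 32 bits are `hi` and low 32 bits are zero.
double doubleFromHigh32(std::uint32_t hi) {
  std::uint64_t bits = static_cast<std::uint64_t>(hi) << 32;
  double d;
  std::memcpy(&d, &bits, sizeof d);
  return d;
}

// The 32-bit literal that encodes a double when its low half is dropped.
std::uint32_t high32(double d) {
  return static_cast<std::uint32_t>(bitsOfDouble(d) >> 32);
}

// True if dropping the low 32 bits loses no precision.
bool low32IsZero(double d) { return (bitsOfDouble(d) & 0xffffffffu) == 0; }

// Integer rule from the list: high 32 bits all zero, or the value is a
// sign-extended 32-bit integer (upper 33 bits all ones).
bool fitsIn32(std::uint64_t v) {
  return (v >> 32) == 0 || (v >> 31) == 0x1ffffffffULL;
}
```

Under these assumptions, the f32 pattern of 0.5 is 0x3f000000, which read as the high half of a double is 2^-15 = 3.0517578125E-5, while 0x3FE00000 in the same position is exactly 0.5. Likewise 1.5 has a zero low half and 3.1415 does not, matching the warning case in the rules.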
* [Sparc][LEON] Removed the parts of the errata fixes implemented using inline assembly as this is not the desired behaviour for end-users. Small change to a unit test to implement this without requiring the inline assembly. | Chris Dewhurst | 2016-09-09 | 1 | -76/+0
  llvm-svn: 281047
* [ARM] ADD with a negative offset can become SUB for free | James Molloy | 2016-09-09 | 1 | -0/+4
  So model that directly in TTI::getIntImmCost().
  llvm-svn: 281044
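The idea can be sketched with a deliberately simplified cost model. Everything here is an assumption made for illustration: the helper names are hypothetical, the cost values 1 and 2 are placeholders, and only the ARM-mode "8-bit value rotated by an even amount" immediate form is modelled. It is not the code in the ARM TTI implementation.

```cpp
#include <algorithm>
#include <cstdint>

// Rotate a 32-bit value left by r bits (r in 0..31).
std::uint32_t rotl32(std::uint32_t v, unsigned r) {
  return r == 0 ? v : (v << r) | (v >> (32 - r));
}

// ARM-mode modified immediate: an 8-bit value rotated by an even amount.
bool isModifiedImm(std::uint32_t v) {
  for (unsigned r = 0; r < 32; r += 2)
    if (rotl32(v, r) <= 0xffu)
      return true;
  return false;
}

// Placeholder cost: 1 if the immediate (or its bitwise complement)
// can be encoded directly, otherwise 2.
unsigned immCost(std::uint32_t v) {
  return (isModifiedImm(v) || isModifiedImm(~v)) ? 1 : 2;
}

// ADD x, #imm can be emitted as SUB x, #-imm at no extra cost, so the
// cheaper of the two encodings is the one that counts.
unsigned addImmCost(std::uint32_t v) {
  return std::min(immCost(v), immCost(0u - v));
}
```

For example, -0x104 is not encodable in this model, but 0x104 is (0x41 rotated), so an ADD of -0x104 is costed like a SUB of 0x104.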