path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* AMDGPU: Custom lower vector_shuffle for v4i16/v4f16 | Matt Arsenault | 2019-07-02 | 2 | -0/+63
  Ordinarily it is lowered as a build_vector of each extract_vector_elt, which in turn get lowered to bitcasts and bit shifts. Very little understands the lowered extract pattern, resulting in much worse code. We treat concat_vectors of v2i16 as legal, so prefer that.
  llvm-svn: 364959
* [RA] Fix spelling of Greedy register allocator internal option | Teresa Johnson | 2019-07-02 | 1 | -1/+1
  The internal option added with r323870 has a typo. It isn't being used by any tests, but I decided to fix the spelling and leave it in for use in debugging the changes added in that patch.
  llvm-svn: 364958
* [C++2a] Add __builtin_bit_cast, used to implement std::bit_cast | Erik Pilkington | 2019-07-02 | 2 | -53/+53
  This commit adds a new builtin, __builtin_bit_cast(T, v), which performs a bit_cast from a value v to a type T. This expression can be evaluated at compile time under specific circumstances.
  The compile time evaluation currently doesn't support bit-fields, but I'm planning on fixing this in a follow-up (some of the logic for figuring this out is in CodeGen). I'm also planning follow-ups for supporting some more esoteric types that the constexpr evaluator supports, as well as extending __builtin_memcpy constexpr evaluation to use the same infrastructure.
  rdar://44987528
  Differential revision: https://reviews.llvm.org/D62825
  llvm-svn: 364954
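As a rough illustration of how a library-level bit_cast could sit on top of the builtin described above, here is a minimal sketch. The name `bit_cast_sketch` is hypothetical, and the `static_assert` at the end assumes a 32-bit IEEE-754 `float`; it also assumes a compiler that provides `__builtin_bit_cast`.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical helper (not the real std::bit_cast): forwards to the builtin
// after checking the preconditions a bit cast normally needs.
template <typename To, typename From>
constexpr To bit_cast_sketch(const From &from) noexcept {
  static_assert(sizeof(To) == sizeof(From), "sizes must match");
  static_assert(std::is_trivially_copyable<To>::value &&
                    std::is_trivially_copyable<From>::value,
                "types must be trivially copyable");
  return __builtin_bit_cast(To, from);
}

// Evaluated at compile time. Assumes IEEE-754 binary32, where 1.0f has the
// bit pattern 0x3F800000.
static_assert(bit_cast_sketch<std::uint32_t>(1.0f) == 0x3F800000u,
              "unexpected float representation");
```

Unlike a `reinterpret_cast` or a union, this form is usable in constant expressions, which is the point of routing it through a builtin.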
* [X86] getTargetConstantBitsFromNode - remove unnecessary getZExtValue() (PR42486) | Simon Pilgrim | 2019-07-02 | 1 | -2/+1
  Don't use APInt::getZExtValue() if you can avoid it - eventually someone will call it with i128 or something that doesn't fit into 64-bits.
  In this case it was completely superfluous as we'd moved the rest of the code to always use APInt.
  Fixes the <1 x i128> addition bug in PR42486
  llvm-svn: 364953
* [AMDGPU] LCSSA pass added in preISel. Fixing typo in previous commit | Alexander Timofeev | 2019-07-02 | 1 | -1/+1
  llvm-svn: 364952
* [AMDGPU] LCSSA pass added in preISel. Uniform values defined in the divergent loop and used outside | Alexander Timofeev | 2019-07-02 | 2 | -0/+19
  Differential Revision: https://reviews.llvm.org/D63953
  Reviewers: rampitec, nhaehnle, arsenm
  llvm-svn: 364950
* [X86] Add patterns to select (scalar_to_vector (loadf32)) as (V)MOVSSrm instead of COPY_TO_REGCLASS + (V)MOVSSrm_alt. | Craig Topper | 2019-07-02 | 2 | -9/+24
  Similar for (V)MOVSD. Ultimately, I'd like to see about folding scalar_to_vector+load to vzload, which would select as (V)MOVSSrm, so this is closer to that.
  llvm-svn: 364948
* [SimplifyLibCalls] powf(x, sitofp(n)) -> powi(x, n) | David Bolvansky | 2019-07-02 | 1 | -12/+47
  Summary: Partially solves https://bugs.llvm.org/show_bug.cgi?id=42190
  Reviewers: spatel, nikic, efriedma
  Reviewed By: efriedma
  Subscribers: efriedma, nikic, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D63038
  llvm-svn: 364940
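The rewrite named in the title can be pictured at source level with the sketch below. Both function names are hypothetical: `pow_via_sitofp` stands for the shape before the transform (integer exponent converted to float, then a float pow call), and `powi_sketch` is only a stand-in for an integer-exponent power, written here as repeated squaring. The two can round differently for large exponents, so this is an illustration of the idea rather than a statement about when the optimizer is allowed to apply it.

```cpp
#include <cmath>

// Hypothetical stand-in for an integer-exponent power: repeated squaring.
inline float powi_sketch(float x, int n) {
  // Widen first so that negating the most negative int cannot overflow.
  long long e = n;
  const bool negative = e < 0;
  if (negative)
    e = -e;
  float result = 1.0f;
  float base = x;
  while (e > 0) {
    if (e & 1)
      result *= base;
    base *= base;
    e >>= 1;
  }
  return negative ? 1.0f / result : result;
}

// Shape before the rewrite: the int exponent is converted to float (the
// sitofp) and passed to the float overload of pow.
inline float pow_via_sitofp(float x, int n) {
  return std::pow(x, static_cast<float>(n));
}
```

For small exponents and exactly representable values, such as `powi_sketch(2.0f, 10)`, both forms give the same result.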
* Provide basic Full LTO extension points | Serge Guelton | 2019-07-02 | 1 | -0/+4
  Differential Revision: https://reviews.llvm.org/D61738
  llvm-svn: 364937
* getMainExecutable: handle realpath() failure, falling back to getprogpath(). | Sam McCall | 2019-07-02 | 1 | -10/+10
  Summary: Previously, we'd pass a nullptr to std::string and crash. This case happens when the binary is deleted while being used (e.g. rebuilding clangd).
  Reviewers: kadircet
  Subscribers: ilya-biryukov, kristina, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D64068
  llvm-svn: 364936
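The failure mode described above (a null result from realpath() handed to std::string) and the fallback idea can be sketched as follows. `resolve_or_fallback` is a hypothetical helper for illustration, not the actual getMainExecutable code, and it assumes a POSIX system where `realpath` and `PATH_MAX` are available.

```cpp
#include <limits.h>
#include <stdlib.h>
#include <string>

// Hypothetical helper: resolve `path`, or return `fallback` when realpath()
// fails, instead of building a std::string from a null pointer.
std::string resolve_or_fallback(const char *path, const std::string &fallback) {
  char resolved[PATH_MAX];
  // realpath() returns nullptr on failure, for example when the file has
  // been deleted; std::string(nullptr) would be undefined behaviour.
  if (const char *real = ::realpath(path, resolved))
    return std::string(real);
  return fallback;
}
```

A caller would pass the best available guess for the program path as `fallback`, so a deleted binary degrades to an unresolved path rather than a crash.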
* AMDGPU/GlobalISel: Try generated matcher with intrinsics | Matt Arsenault | 2019-07-02 | 1 | -8/+7
  llvm-svn: 364933
* AMDGPU/GlobalISel: Select mul | Matt Arsenault | 2019-07-02 | 1 | -1/+1
  llvm-svn: 364932
* AMDGPU/GlobalISel: Fix G_GEP with mixed SGPR/VGPR operands | Matt Arsenault | 2019-07-02 | 1 | -4/+6
  The register bank for the destination of the sample argument copy was wrong. We shouldn't be constraining each source to the result register bank. Allow constraining the original register to the right size.
  llvm-svn: 364928
* AMDGPU/GlobalISel: Select G_FENCE | Matt Arsenault | 2019-07-02 | 1 | -0/+5
  Manually select to work around the tablegen emitter emitting checks for G_CONSTANT.
  llvm-svn: 364927
* GlobalISel: Add G_FENCE | Matt Arsenault | 2019-07-02 | 2 | -0/+15
  The pattern importer is for some reason emitting checks for G_CONSTANT for the immediate operands.
  llvm-svn: 364926
* [X86][AVX] combineX86ShuffleChain - pull out CombineShuffleWithExtract lambda. NFCI. | Simon Pilgrim | 2019-07-02 | 1 | -105/+116
  Pull out the CombineShuffleWithExtract lambda to a new combineX86ShuffleChainWithExtract wrapper and refactor it to handle more than 2 shuffle inputs - this will allow combineX86ShufflesRecursively to call this in a future patch.
  llvm-svn: 364924
* [NFC][TargetLowering] Some preparatory cleanups around 'prepareUREMEqFold()' from D63963 | Roman Lebedev | 2019-07-02 | 1 | -17/+18
  llvm-svn: 364921
* [InstCombine] Shift amount reassociation: fixup constantexpr handling (PR42484) | Roman Lebedev | 2019-07-02 | 1 | -3/+3
  I was actually wondering if there was some nicer way than m_Value()+cast, but apparently what I was really "subconsciously" thinking about was a correctness issue.
  hasNoUnsignedWrap()/hasNoSignedWrap() exist for Instruction, not for BinaryOperator, so let's just use m_Instruction(), thus avoiding both a cast and a crash.
  Fixes https://bugs.llvm.org/show_bug.cgi?id=42484, https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=15587
  llvm-svn: 364915
* [llvm] [Support] Clean PrintStackTrace() ptr arithmetic up | Michal Gorny | 2019-07-02 | 1 | -5/+2
  Use '%tu' modifier for pointer arithmetic since we are using C++11 already. Prefer static_cast<> over C-style cast. Remove unnecessary conversion of result, and add const qualifier to converted pointers, to silence the following warning:
    In file included from /home/mgorny/llvm-project/llvm/lib/Support/Signals.cpp:220:0:
    /home/mgorny/llvm-project/llvm/lib/Support/Unix/Signals.inc: In function ‘void llvm::sys::PrintStackTrace(llvm::raw_ostream&)’:
    /home/mgorny/llvm-project/llvm/lib/Support/Unix/Signals.inc:546:53: warning: cast from type ‘const void*’ to type ‘char*’ casts away qualifiers [-Wcast-qual]
        (char*)dlinfo.dli_saddr));
        ^~~~~~~~~
  Differential Revision: https://reviews.llvm.org/D63888
  llvm-svn: 364912
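The combination described above (a const-preserving static_cast plus the 't' length modifier for a pointer difference) can be sketched like this. `format_offset` is a hypothetical helper, not the code in Signals.inc, and it assumes `addr` does not precede `base` and that both point into the same object.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <type_traits>

// Hypothetical helper: format the byte offset of `addr` from `base`.
std::string format_offset(const void *base, const void *addr) {
  // static_cast to const char* keeps the const qualifier, so -Wcast-qual
  // has nothing to complain about.
  const char *b = static_cast<const char *>(base);
  const char *a = static_cast<const char *>(addr);
  // a - b has type std::ptrdiff_t; "%tu" expects its unsigned counterpart.
  using UDiff = std::make_unsigned<std::ptrdiff_t>::type;
  char buf[32];
  std::snprintf(buf, sizeof(buf), "+%tu", static_cast<UDiff>(a - b));
  return buf;
}
```

If the difference could be negative, "%td" with the plain `std::ptrdiff_t` value would be the matching choice instead.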
* [IDF] Generalize IDFCalculator to be used with Clang's CFG | Kristof Umann | 2019-07-02 | 2 | -105/+0
  I'm currently working on a GSoC project that aims to improve the bug reports of the analyzer. The main heuristic I plan to use is to better explain values that are a control dependency of the bug location.
    bool b = messyComputation();
    int i = 0;
    if (b) // control dependency of the bug site, let's explain why we assume val
           // to be true
      10 / i; // warn: division by zero
  Because of this, I'd like to generalize IDFCalculator so that I could use it for Clang's CFG: D62883.
  In detail:
  * Rename IDFCalculator to IDFCalculatorBase, make it take a general CFG node type as a template argument rather than strictly BasicBlock (but preserve ForwardIDFCalculator and ReverseIDFCalculator)
  * Move IDFCalculatorBase from llvm/include/llvm/Analysis to llvm/include/llvm/Support (but leave the BasicBlock variants in llvm/include/llvm/Analysis)
  * clang-format the file since this patch messes up git blame anyways
  * Change typedef to using
  * Add the new type ChildrenGetterTy, and store an instance of it in IDFCalculatorBase. This is important because I'll have to specialize it for Clang's CFG to filter out nullpointer successors, similarly to D62507.
  Differential Revision: https://reviews.llvm.org/D63389
  llvm-svn: 364911
* [ARM] MVE: allow soft-float ABI to pass vector types. | Simon Tatham | 2019-07-02 | 3 | -2/+42
  Passing a vector type over the soft-float ABI involves it being split into four GPRs, so the first thing that has to happen at the start of the function is to recombine those into a vector register. The ABI types all vectors as v2f64, so we need to support BUILD_VECTOR for that type, which I do in this patch by allowing it to be expanded in terms of INSERT_VECTOR_ELT, and writing an ISel pattern for that in turn. Similarly, I provide a rule for EXTRACT_VECTOR_ELT so that a returned vector can be marshalled back into GPRs.
  While I'm here, I've also added ISD::UNDEF to the list of operations we turn back on in `setAllExpand`, because I noticed that otherwise it gets expanded into a BUILD_VECTOR with explicit zero inputs, leading to pointless machine instructions to zero out a vector register that's about to have every lane overwritten in any case.
  Reviewers: dmgreen, ostannard
  Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D63937
  llvm-svn: 364910
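The split-and-recombine step described above can be pictured in plain C++ as a byte-for-byte copy between a two-double vector and four 32-bit words. This is only a conceptual sketch: the type aliases and function names are hypothetical, and it does not model which register receives which half or any endianness rules of the real ABI.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

using V2F64 = std::array<double, 2>;       // stand-in for a 128-bit vector
using GPRs = std::array<std::uint32_t, 4>; // stand-in for four 32-bit GPRs

static_assert(sizeof(V2F64) == sizeof(GPRs), "both must be 128 bits");

// Caller side: split the vector into four general-purpose words.
GPRs split_to_gprs(const V2F64 &v) {
  GPRs regs;
  std::memcpy(regs.data(), v.data(), sizeof(regs));
  return regs;
}

// Callee side: recombine the four words into a vector before using it.
V2F64 combine_from_gprs(const GPRs &regs) {
  V2F64 v;
  std::memcpy(v.data(), regs.data(), sizeof(v));
  return v;
}
```

Splitting and then recombining is lossless, which is the property the calling convention relies on.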
* [ARM] Stop using scalar FP instructions in integer-only MVE mode. | Simon Tatham | 2019-07-02 | 3 | -18/+35
  If you compile with `-mattr=+mve` (enabling integer MVE instructions but not floating-point ones), then the scalar FP //registers// exist and it's legal to move things in and out of them, load and store them, but it's not legal to do arithmetic on them.
  In D60708, the calls to `addRegisterClass` in ARMISelLowering that enable use of the scalar FP registers became conditionalised on `Subtarget->hasFPRegs()` instead of `Subtarget->hasVFP2Base()`, so that loads, stores and moves of those registers would work. But I didn't realise that that would also enable all the operations on those types by default.
  Now, if the target doesn't have basic VFP, we follow up those `addRegisterClass` calls by turning back off all the nontrivial operations you can perform on f32 and f64. That causes several knock-on failures, which are fixed by allowing the `VMOVDcc` and `VMOVScc` instructions to be selected even if all you have is `HasFPRegs`, and adjusting several checks for 'is this a double in a single-precision-only world?' to the more general 'is this any FP type we can't do arithmetic on?'. Between those, the whole of the `float-ops.ll` and `fp16-instructions.ll` tests can now run in MVE-without-FP mode and generate correct-looking code.
  One odd side effect is that I had to relax the check lines in that test so that they permit test functions like `add_f` to be generated as tailcalls to software FP library functions, instead of ordinary calls. Doing that is entirely legal, but the mystery is why this is the first RUN line that's needed the relaxation: on the usual kind of non-FP target, no tailcalls ever seem to be generated. Going by the llc messages, I think `SoftenFloatResult` must be perturbing the code generation in some way, but that's as much as I can guess.
  Reviewers: dmgreen, ostannard
  Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D63938
  llvm-svn: 364909
* [X86] resolveTargetShuffleInputsAndMask - add repeated input handling. | Simon Pilgrim | 2019-07-02 | 1 | -7/+22
  We were relying on combineX86ShufflesRecursively to handle this - this patch gets it done earlier which should make it easier for other code to use resolveTargetShuffleInputsAndMask.
  llvm-svn: 364906
* [mips] Mark P5600 scheduling model as complete | Simon Atanasyan | 2019-07-02 | 1 | -1/+1
  llvm-svn: 364902
* [mips] Add missing schedinfo for FPU load/store/conv instructions | Simon Atanasyan | 2019-07-02 | 1 | -4/+10
  llvm-svn: 364900
* [mips] Map SNOP, NOP to the P5600Nop scheduler resource | Simon Atanasyan | 2019-07-02 | 1 | -2/+8
  llvm-svn: 364899
* [yaml2obj] - Allow overriding sh_offset field from the YAML. | George Rimar | 2019-07-02 | 1 | -0/+6
  Some of our test cases use objects that have sections with a broken sh_offset field. There was no way to set it from YAML until this patch.
  Differential revision: https://reviews.llvm.org/D63879
  llvm-svn: 364898
* [DWARF] Simplify dumping of a .debug_addr section. | Igor Kudrin | 2019-07-02 | 1 | -21/+6
  This patch removes the part which tried to interpret addresses in that section as offsets and simplifies the remaining code.
  Differential Revision: https://reviews.llvm.org/D64020
  llvm-svn: 364896
* [TailDuplicator] Fix copy instruction emitting into the wrong block. | Amara Emerson | 2019-07-02 | 1 | -1/+1
  The code for duplicating instructions could sometimes try to emit copies intended to deal with unconstrainable register classes to the tail block of the original instruction, rather than before the newly cloned instruction in the predecessor block.
  This was exposed by GlobalISel on arm64.
  Differential Revision: https://reviews.llvm.org/D64049
  llvm-svn: 364888
* [X86] Add PreprocessISelDAG support for turning ISD::FP_TO_SINT/UINT into X86ISD::CVTTP2SI/CVTTP2UI and to reduce the number of isel patterns. | Craig Topper | 2019-07-02 | 3 | -131/+30
  llvm-svn: 364887
* [PowerPC] Implement the areMemAccessesTriviallyDisjoint hook | QingShan Zhang | 2019-07-02 | 2 | -0/+72
  After implementing this hook, we model the memory dependency in the scheduling dependency graph more precisely and have more opportunity to reorder loads/stores, since under some conditions they have no dependency.
  Differential Revision: https://reviews.llvm.org/D63804
  llvm-svn: 364886
* [DAGCombiner] Exploiting more about the transformation of TransformFPLoadStorePair function | Zi Xuan Wu | 2019-07-02 | 1 | -4/+2
  For a given floating point load / store pair, if the load value isn't used by any other operations, then consider transforming the pair to integer load / store operations if the target deems the transformation profitable.
  We can exploit this much further when there are other operation nodes with a chain operand between the load/store pair, as long as we preserve the original chain ordering. We only replace the register used to load/store from float to integer.
  I only added a test case for ARM because the TLI.isDesirableToTransformToIntegerOp hook is only enabled in the ARM target.
  Differential Revision: https://reviews.llvm.org/D60601
  llvm-svn: 364883
* Revert Recommit [PowerPC] Update P9 vector costs for insert/extract element | Jordan Rupprecht | 2019-07-01 | 1 | -29/+0
  This reverts r364557 (git commit 9f7f5858fe46b8e706e87a83e2fd0a2678be619e)
  This crashes as reported on the commit thread. Repro instructions TBD.
  llvm-svn: 364876
* [PGO] Update ICP pass for recent byval type changes | Reid Kleckner | 2019-07-01 | 1 | -0/+9
  Fixes verifier errors encountered in PR42413.
  Reviewers: xur, t.p.northover, inglorion, gbiv, george.burgess.iv
  Differential Revision: https://reviews.llvm.org/D63842
  llvm-svn: 364861
* AMDGPU: Correct properties for adjcallstack* pseudos | Matt Arsenault | 2019-07-01 | 1 | -0/+4
  These should be SALU writes, and these are lowered to instructions that def SCC.
  llvm-svn: 364859
* [InstCombine] reduce more checks for power-of-2-or-zero using ctpop | Sanjay Patel | 2019-07-01 | 1 | -1/+7
  Extends the transform from:
  rL364341
  ...to include another (more common?) pattern that tests whether a value is a power-of-2 (including or excluding zero).
  llvm-svn: 364856
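The log does not show which exact patterns the transform matches, so the sketch below only illustrates the general equivalence it relies on: a bit-trick test for "power of two, possibly zero" can be expressed as a population-count comparison. All four function names are hypothetical, and `std::bitset::count` is used here as a portable stand-in for ctpop.

```cpp
#include <bitset>
#include <cstdint>

// Bit-trick form: clearing the lowest set bit leaves nothing behind.
inline bool pow2_or_zero_bittrick(std::uint32_t x) {
  return (x & (x - 1)) == 0;
}

// Population-count form of the same test: at most one bit is set.
inline bool pow2_or_zero_ctpop(std::uint32_t x) {
  return std::bitset<32>(x).count() < 2;
}

// Variant that excludes zero, in both forms.
inline bool pow2_bittrick(std::uint32_t x) {
  return x != 0 && (x & (x - 1)) == 0;
}

inline bool pow2_ctpop(std::uint32_t x) {
  return std::bitset<32>(x).count() == 1;
}
```

The ctpop forms replace a two-part check with a single comparison, which is presumably what makes them a convenient canonical shape for later passes.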
* [X86] Use v4i32 vzloads instead of v2i64 for vpmovzx/vpmovsx patterns where only 32-bits are loaded. | Craig Topper | 2019-07-01 | 3 | -9/+7
  v2i64 vzload defines a 64-bit memory access. It doesn't look like we have any coverage for this either way.
  Also remove some vzload usages where the instruction loads only 16-bits.
  llvm-svn: 364851
* [mips] Add missing schedinfo for MIPSeh_return[32|64] instructions | Simon Atanasyan | 2019-07-01 | 1 | -1/+1
  llvm-svn: 364850
* [mips] Add virtualization ASE to P5600 scheduling definitions | Simon Atanasyan | 2019-07-01 | 1 | -0/+5
  llvm-svn: 364849
* [mips] Add missing schedinfo for LONG_BRANCH_* instructions | Simon Atanasyan | 2019-07-01 | 2 | -11/+27
  llvm-svn: 364848
* [X86] Remove several bad load folding isel patterns for VPMOVZX/VPMOVSX. | Craig Topper | 2019-07-01 | 2 | -12/+0
  These patterns all matched a v2i64 vzload which only loads 64-bits to instructions that load a full 128-bits.
  llvm-svn: 364847
* Revert [SLP] Look-ahead operand reordering heuristic. | Jordan Rupprecht | 2019-07-01 | 1 | -236/+46
  This reverts r364478 (git commit 574cb0eb3a7ac95e62d223a60bef891171dfe321)
  The patch is causing compilation timeouts.
  llvm-svn: 364846
* Testing commit access through minor formatting change | Nilanjana Basu | 2019-07-01 | 1 | -2/+3
  llvm-svn: 364843
* GlobalISel: Try to widen merges with other merges | Matt Arsenault | 2019-07-01 | 1 | -2/+28
  If the requested source type can be used as a merge source type, create a merge of merges. This avoids creating large, illegal extensions and bit-ops directly to the result type.
  llvm-svn: 364841
* [X86] Correct v4f32->v2i64 cvt(t)ps2(u)qq memory isel patterns | Craig Topper | 2019-07-01 | 2 | -2/+93
  These instructions only read 64-bits of memory so we shouldn't allow a full vector width load to be pattern matched in case it is marked volatile.
  Instead allow vzload or scalar_to_vector+load.
  Also add a DAG combine to turn full vector loads into vzload when used by one of these instructions if the load isn't volatile.
  This fixes another case for PR42079
  llvm-svn: 364838
* AMDGPU/GlobalISel: Handle more input argument intrinsics | Matt Arsenault | 2019-07-01 | 2 | -41/+72
  llvm-svn: 364836
* AMDGPU/GlobalISel: Lower kernarg segment ptr intrinsics | Matt Arsenault | 2019-07-01 | 3 | -24/+48
  llvm-svn: 364835
* AMDGPU/GlobalISel: Legalize workgroup ID intrinsics | Matt Arsenault | 2019-07-01 | 2 | -0/+36
  llvm-svn: 364834
* AMDGPU/GlobalISel: Legalize workitem ID intrinsics | Matt Arsenault | 2019-07-01 | 3 | -0/+127
  Tests don't cover the masked input path since non-kernel arguments aren't lowered yet.
  Test is copied directly from the existing test, with 2 additions.
  llvm-svn: 364833
* AMDGPU/GlobalISel: Custom lower control flow intrinsics | Matt Arsenault | 2019-07-01 | 2 | -0/+68
  Replace the brcond for the 2 cases that act as branches. For now follow how the current system works, although I think we can eventually get rid of the pseudos.
  llvm-svn: 364832