path: root/llvm/lib
Commit message, Author, Age, Files, Lines
...
* [DAG] Avoid smart constructor-based dangling nodes.Nirav Dave2019-03-262-0/+15
| | | | | | | | | | | | | | | Various SelectionDAG non-combine operations (e.g. the getNode smart constructor and legalization) may leave dangling nodes by applying optimizations or not fully pruning unused result values. This can result in nodes that are never added to the worklist and therefore can not be pruned. Add a node inserter as the current node deleter to make sure such nodes have the chance of being pruned. Many minor changes, mostly positive. llvm-svn: 356996
* Moved body of methods dump to .cpp file to fix compilation when modulesMikhail R. Gadelha2019-03-261-0/+4
| | | | | | are enabled llvm-svn: 356994
* [RISCV] Improve codegen for icmp {ne,eq} with a constantLuis Marques2019-03-261-0/+4
| | | | | | | | | Adds two patterns to improve the codegen of GPR value comparisons with small constants. Instead of first loading the constant into another register and then doing an XOR of those registers, these patterns directly use the constant as an XORI immediate. llvm-svn: 356990
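The identity these patterns rely on can be sketched in plain C++. This is only an illustration of the arithmetic, not the TableGen patterns from the commit; the helper names `fitsSImm12`, `isEqViaXor` and `isNeViaXor` are hypothetical. It assumes the "small constants" are those that fit XORI's 12-bit sign-extended immediate.

```cpp
#include <cstdint>

// RISC-V XORI takes a 12-bit sign-extended immediate, so only constants in
// [-2048, 2047] can be used directly (hypothetical helper name).
inline bool fitsSImm12(int64_t C) { return C >= -2048 && C <= 2047; }

// x == C holds exactly when (x ^ C) == 0, so the comparison needs no
// separate register to hold C.
inline bool isEqViaXor(int64_t X, int64_t C) { return (X ^ C) == 0; }

// x != C holds exactly when (x ^ C) != 0.
inline bool isNeViaXor(int64_t X, int64_t C) { return (X ^ C) != 0; }
```

The XOR result is then compared against zero, which the target can do without materializing the constant.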
* [TargetLowering] Add SimplifyDemandedBits support for ISD::INSERT_VECTOR_ELTSimon Pilgrim2019-03-262-3/+45
| | | | | | | | | | | | This helps us relax the extension of a lot of scalar elements before they are inserted into a vector. It exposes an issue in DAGCombiner::convertBuildVecZextToZext as some/all the zero-extensions may be relaxed to ANY_EXTEND, so we need to handle that case to avoid a couple of AVX2 VPMOVZX test regressions. Once this is in it should be easier to fix a number of remaining failures to fold loads into VBROADCAST nodes. Differential Revision: https://reviews.llvm.org/D59484 llvm-svn: 356989
* Fix nondeterminism introduced in r353954Yi Kong2019-03-262-2/+3
| | | | | | | | | | DenseMap iteration order is not guaranteed, use MapVector instead. Fix provided by srhines. Differential Revision: https://reviews.llvm.org/D59807 llvm-svn: 356988
* [TableGen] Let list elements have a trailing commaJaved Absar2019-03-261-0/+4
| | | | | | | | | | | Let lists have a trailing comma to allow cleaner diffs, e.g.: def : Features<[FeatureA, FeatureB, ]>; Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D59247 llvm-svn: 356986
* [TableGen] Give meaningful msg for def use in multiclassJaved Absar2019-03-261-2/+8
| | | | | | | | | | | | | When one mistakenly specifies 'def' instead of using 'defm', the error message is quite misleading: 'Couldn't find class..' Instead, it should recommend using defm if a multiclass of the same name exists. Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D59294 llvm-svn: 356985
* [ARM][Asm] Accept upper case coprocessor number and registersOliver Stannard2019-03-261-2/+2
| | | | | | Differential revision: https://reviews.llvm.org/D59760 llvm-svn: 356984
* [llvm-dlltool] Set a proper machine type for weak symbol object filesMartin Storsjo2019-03-261-1/+1
| | | | | | | | | | | | | | | This makes GNU binutils not reject the libraries outright. GNU ld handles weak externals slightly differently though, so it can't use them for aliases in import libraries, but this makes GNU ld able to use the rest of the import libraries. LLD accepted object files with machine type 0 aka IMAGE_FILE_MACHINE_UNKNOWN. Differential Revision: https://reviews.llvm.org/D59742 llvm-svn: 356982
* [X86] In matchBitExtract, place all of the new nodes before Node's position ↵Craig Topper2019-03-261-10/+9
| | | | | | | | | | in the DAG for the topological sort. We were using OrigNBits, but that put all the nodes before the node we used to start the control computation. This caused some node earlier than the sequence we inserted to be selected before the sequence we created. We want our new sequence to be selected first since it depends on OrigNBits. I don't have a test case. Found by reviewing the code. llvm-svn: 356979
* [X86] In matchBitExtract, if we need to truncate the BEXTR make sure we put ↵Craig Topper2019-03-261-1/+1
| | | | | | | | the BEXTR at Node's position in the DAG for the topological sort. We were using OrigNBits, but that doesn't guarantee that it will be selected before the nodes that make up X. llvm-svn: 356978
* [X86] Remove unneeded FIXME. NFCCraig Topper2019-03-261-2/+0
| | | | | | We do fold loads right below this. llvm-svn: 356977
* X86Parser: Fix potential reference to deleted objectCraig Topper2019-03-261-9/+9
| | | | | | | | Within the MatchFPUWaitAlias function, Operands[0] is potentially overwritten leading to &Op referencing a deleted object. To fix this, assign the reference after the function. Differential Revision: https://reviews.llvm.org/D57376 llvm-svn: 356973
* X86AsmParser: Do not process a non-existent tokenCraig Topper2019-03-261-0/+5
| | | | | | | | | | This error can only happen if an unfinished operation is at Eof. Patch by Brandon Jones Differential Revision: https://reviews.llvm.org/D57379 llvm-svn: 356972
* [ARM] Add missing memory operands to a bunch of instructions.Eli Friedman2019-03-253-25/+59
| | | | | | | | | | | | | | This should hopefully lead to minor improvements in code generation, and more accurate spill/reload comments in assembly. Also fix isLoadFromStackSlotPostFE/isStoreToStackSlotPostFE so they don't lead to misleading assembly comments for merged memory operands; this is technically orthogonal, but in practice the relevant memory operand lists don't show up without this change. Differential Revision: https://reviews.llvm.org/D59713 llvm-svn: 356963
* Revert "AMDGPU: Scavenge register instead of findUnusedReg"Matt Arsenault2019-03-251-1/+1
| | | | | | | | This reverts r356149. This is crashing on rocBLAS. llvm-svn: 356958
* AMDGPU: Remove unnecessary check for isFullCopyMatt Arsenault2019-03-251-1/+1
| | | | | | | Subregister indexes are not used for physical register operands, so isFullCopy is implied by the physical register check. llvm-svn: 356956
* [AArch64] Prefer "mov" over "orr" to materialize constants.Eli Friedman2019-03-251-2/+5
| | | | | | | | | | | | | This is generally more readable due to the way the assembler aliases work. (This causes a lot of test changes, but it's not really as scary as it looks at first glance; it's just mechanically changing a bunch of checks for orr to check for mov instead.) Differential Revision: https://reviews.llvm.org/D59720 llvm-svn: 356954
* AMDGPU: Set hasSideEffects 0 on _term instructionsMatt Arsenault2019-03-251-0/+3
| | | | | | | | These were defaulting to true, but they are just wrappers around bit operations. This avoids regressions in the exec mask optimization passes in a future commit. llvm-svn: 356952
* Revert "[llvm] Prevent duplicate files in debug line header in dwarf 5."Ali Tamur2019-03-256-33/+6
| | | | | | | | This reverts commit 312ab05887d0e2caa29aaf843cefe39379a98d36. My commit broke the build; I will revert and find out what happened. llvm-svn: 356951
* [LLVM-C] Add binding to look up intrinsic by nameRobert Widmann2019-03-251-0/+4
| | | | | | | | | | | | | | | | Summary: Add a binding to Function::lookupIntrinsicID so clients don't have to go searching the ID table themselves. Reviewers: whitequark, deadalnix Reviewed By: whitequark Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59697 llvm-svn: 356948
* AMDGPU: Add support for cross address space synchronization scopesKonstantin Zhuravlyov2019-03-253-32/+101
| | | | | | Differential Revision: https://reviews.llvm.org/D59517 llvm-svn: 356946
* [llvm] Prevent duplicate files in debug line header in dwarf 5.Ali Tamur2019-03-256-6/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Motivation: In previous dwarf versions, file name indexes started from 1, and the primary source file was not explicit. Dwarf 5 standard (6.2.4) prescribes the primary source file to be explicitly given an entry with an index number 0. The current implementation honors the specification by just duplicating the main source file, once with index number 0, and later maybe with another index number. While this is compliant with the letter of the standard, the duplication causes problems for consumers of this information such as lldb. (Some files are duplicated, where only some of them have a line table although all refer to the same file) With this change, dwarf 5 debug line section files always start from 0, and the zeroth entry is not duplicated whenever possible. This requires different handling of dwarf 4 and dwarf 5 during generation (e.g. when a function returns an index zero for a file name, it signals an error in dwarf 4, but not in dwarf 5) However, I think the minor complication is worth it, because it enables all consumers (lldb, gdb, dwarfdump, objdump, and so on) to treat all files in the file name list homogeneously. Reviewers: dblaikie, probinson, aprantl, espindola Reviewed By: probinson Subscribers: emaste, jvesely, nhaehnle, aprantl, javed.absar, arichardson, hiraditya, MaskRay, rupprecht, jdoerfert, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D59515 llvm-svn: 356941
* [SLPVectorizer] Merge reorderAltShuffleOperands into ↵Simon Pilgrim2019-03-251-85/+35
| | | | | | | | | | reorderInputsAccordingToOpcode As discussed on D59738, this generalizes reorderInputsAccordingToOpcode to handle multiple + non-commutative instructions so we can get rid of reorderAltShuffleOperands and make use of the extra canonicalizations that reorderInputsAccordingToOpcode brings. Differential Revision: https://reviews.llvm.org/D59784 llvm-svn: 356939
* [SelectionDAG] Add icmp UNDEF handling to SelectionDAG::FoldSetCCSimon Pilgrim2019-03-251-3/+19
| | | | | | | | | | First half of PR40800, this patch adds DAG undef handling to icmp instructions to match the behaviour in llvm::ConstantFoldCompareInstruction and SimplifyICmpInst, this permits constant folding of vector comparisons where some elements had been reduced to UNDEF (by SimplifyDemandedVectorElts etc.). This involved a lot of tweaking to reduced tests as bugpoint loves to reduce icmp arguments to undef........ Differential Revision: https://reviews.llvm.org/D59363 llvm-svn: 356938
* [CGP] Build the DominatorTree lazilyTeresa Johnson2019-03-251-34/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: In r355512 CGP was changed to build the DominatorTree only once per function traversal, to avoid repeatedly building it each time it was accessed. This solved one compile time issue but introduced another. In the second case, we were now building the DT unnecessarily many times when we performed many function traversals (i.e. more than once per function when running CGP because of changes made each time). Change to saving the DT in the CodeGenPrepare object, and building it lazily when needed. It is reset whenever we need to rebuild it. In the case that exposed the issue there are 617 functions, and we walk them (i.e. execute the "while (MadeChange)" loop in runOnFunction) a total of 12083 times (so previously we were building the DT 12083 times). With this patch we only build the DT 844 times (average of 1.37 times per function). We dropped the total time to compile this file from 538.11s without this patch to 339.63s with it. There is still an issue as CGP is taking much longer than all other passes even with this patch, and before a recent compiler release cut at r355392 the total time for this compile was only 97 sec with a huge reduction in CGP time. I suspect that one of the other recent changes to CGP led to iterating each function many more times on average, but I need to do some more investigation. Reviewers: spatel Subscribers: jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59696 llvm-svn: 356937
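The build-lazily, reset-on-invalidation scheme described above can be sketched as a small cache. This is a minimal illustration under stated assumptions, not the CodeGenPrepare code: `FakeDomTree` stands in for the real DominatorTree, and `LazyAnalysisHolder` with its `get`, `reset` and `numBuilds` members are hypothetical names.

```cpp
#include <memory>

// Stand-in for an expensive analysis result (hypothetical type).
struct FakeDomTree {};

// Holds the analysis and builds it only on first request after a reset.
class LazyAnalysisHolder {
  std::unique_ptr<FakeDomTree> DT; // null means "needs rebuilding"
  unsigned NumBuilds = 0;          // counts how often we actually built it

public:
  // Builds the tree on demand; repeated calls reuse the cached copy.
  FakeDomTree &get() {
    if (!DT) {
      DT.reset(new FakeDomTree());
      ++NumBuilds;
    }
    return *DT;
  }

  // Called whenever a transformation invalidates the tree. No rebuild
  // happens here; the cost is paid only if someone asks for it again.
  void reset() { DT.reset(); }

  unsigned numBuilds() const { return NumBuilds; }
};
```

With this shape, a traversal that never queries the tree costs nothing, which is the effect the build counts in the message (12083 versus 844) point to.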
* Moved everything SMT-related to LLVM and updated the cmake scripts.Mikhail R. Gadelha2019-03-252-1/+842
| | | | | | Differential Revision: https://reviews.llvm.org/D54978 llvm-svn: 356929
* MISched: Don't schedule regions with 0 instructionsMatt Arsenault2019-03-251-2/+6
| | | | | | | | | | | | | | | I think this is correct, but may not necessarily be the correct fix for the assertion I'm really trying to solve. If a scheduling region was found that only has dbg_value instructions, the RegPressure tracker would end up in an inconsistent state because it would skip over any debug instructions and point to an instruction outside of the scheduling region. It may still be possible for this to happen if there are some real schedulable instructions between dbg_values, but I haven't managed to break this. The testcase is extremely sensitive and I'm not sure how to make it more resistant to future scheduler changes that would avoid stressing this situation. llvm-svn: 356926
* AMDGPU: Preserve LiveIntervals in WQMMatt Arsenault2019-03-251-0/+2
| | | | | | This seems to already be done, but wasn't marked. llvm-svn: 356922
* [SLPVectorizer] reorderInputsAccordingToOpcode - remove non-Instruction ↵Simon Pilgrim2019-03-251-7/+2
| | | | | | | | | | | | | | canonicalization Remove attempts to commute non-Instructions to the LHS - the codegen changes appear to rely on chance more than anything else and also have a tendency to fight existing instcombine canonicalization which moves constants to the RHS of commutable binary ops. This is prep work towards: (a) reusing reorderInputsAccordingToOpcode for alt-shuffles and removing the similar reorderAltShuffleOperands (b) improving reordering to optimized cases with commutable and non-commutable instructions to still find splat/consecutive ops. Differential Revision: https://reviews.llvm.org/D59738 llvm-svn: 356913
* MinidumpYAML.cpp: Fix some code standard violations missed during reviewPavel Labath2019-03-251-12/+12
| | | | | | functions should begin with lower case letters. NFC. llvm-svn: 356901
* [DebugInfo] IntelJitEventListener follow up for "add SectionedAddress ..."Brock Wyma2019-03-251-3/+13
| | | | | | | | | | | | Following r354972 the Intel JIT Listener would not report line table information because the section indices did not match. There was a similar issue with the PerfJitEventListener. This change performs the section index lookup when building the object address used to query the line table information. Differential Revision: https://reviews.llvm.org/D59490 llvm-svn: 356895
* [MIPS GlobalISel] Select copy for arguments from FPRBRegBankPetar Avramovic2019-03-251-5/+15
| | | | | | | | | Move selectCopy into MipsInstructionSelector class. Select copy for arguments from FPRBRegBank for MIPS32. Differential Revision: https://reviews.llvm.org/D59644 llvm-svn: 356886
* [MIPS GlobalISel] Add floating point register bankPetar Avramovic2019-03-252-0/+7
| | | | | | | | | Add floating point register bank for MIPS32. Implement getRegBankFromRegClass for float register classes. Differential Revision: https://reviews.llvm.org/D59643 llvm-svn: 356883
* [MIPS GlobalISel] Lower float and double arguments in registersPetar Avramovic2019-03-252-36/+98
| | | | | | | | Lower float and double arguments in registers for MIPS32. When a float/double argument is passed through GPR registers, select the appropriate move instruction. Differential Revision: https://reviews.llvm.org/D59642 llvm-svn: 356882
* Fix the build with GCC 4.8 after r356783Hans Wennborg2019-03-251-1/+1
| | | | llvm-svn: 356875
* [ARM GlobalISel] 64-bit memops should be alignedDiana Picus2019-03-251-9/+10
| | | | | | | | | | We currently use only VLDR/VSTR for all 64-bit loads/stores, so the memory operands must be word-aligned. Mark aligned operations as legal and narrow non-aligned ones to 32 bits. While we're here, also mark non-power-of-2 loads/stores as unsupported. llvm-svn: 356872
* [X86] Update some of the getMachineNode calls from X86ISelDAGToDAG to also ↵Craig Topper2019-03-251-8/+9
| | | | | | | | | include a VT for a EFLAGS result. This makes the nodes consistent with how they would be emitted from the isel table. llvm-svn: 356870
* [X86] When selecting (x << C1) op C2 as (x op (C2>>C1)) << C1, use the ↵Craig Topper2019-03-251-1/+2
| | | | | | | | | | | | | operation VT for the target constant. Normally when the nodes we use here(AND32ri8 for example) are selected their immediates are just converted from ConstantSDNode to TargetConstantSDNode without changing VT from the original operation VT. So we should still be emitting them with the operation VT. Theoretically this could expose more accurate opportunities for CSE. llvm-svn: 356869
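The rewrite named in the subject line rests on a bit identity that can be checked in plain C++. The sketch below covers only the AND case on unsigned 32-bit values and assumes C1 < 32; the function names are hypothetical and nothing here is taken from X86ISelDAGToDAG. For OR and XOR the same rewrite is only valid when the low C1 bits of C2 are zero, since the shifted form always produces zeros there.

```cpp
#include <cstdint>

// Original form: shift first, then mask with the full constant C2.
inline uint32_t shiftThenAnd(uint32_t X, unsigned C1, uint32_t C2) {
  return (X << C1) & C2;
}

// Rewritten form: mask with the shifted-down constant, then shift.
// The low C1 bits of (X << C1) are zero anyway, so dropping the low
// C1 bits of C2 loses nothing.
inline uint32_t andThenShift(uint32_t X, unsigned C1, uint32_t C2) {
  return (X & (C2 >> C1)) << C1;
}
```

The shifted-down constant is never larger than C2, which is what can make a shorter immediate form usable.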
* [X86] Remove GetLo8XForm and use GetLo32XForm instead. NFCICraig Topper2019-03-251-6/+1
| | | | | | | | We were using this to create an AND32ri8 node from a 64-bit and, but that node normally still uses a 32-bit immediate. So we should just truncate the existing immediate to i32. We already verified it has the same value in bits 31:7. llvm-svn: 356868
* [X86] Remove a couple unused SDNodeXForms. NFCCraig Topper2019-03-251-11/+0
| | | | llvm-svn: 356867
* Revert r356688 "[X86] Don't avoid folding multiple use sign extended 8-bit ↵Craig Topper2019-03-253-5/+18
| | | | | | | | immediate into instructions under optsize." Looking back over how the one use optimization works, I don't think this is the right way to fix this. llvm-svn: 356866
* [X86][SSE41] Start shuffle combining from ZERO_EXTEND_VECTOR_INREG (PR40685)Simon Pilgrim2019-03-241-28/+33
| | | | | | | | Enable SSE41 ZERO_EXTEND_VECTOR_INREG shuffle combines - for the PMOVZX(PSHUFD(V)) -> UNPCKH(V,0) pattern we reduce the shuffles (port5-bottleneck on Intel) at the expense of creating a zero (pxor v,v) and an extra register move - which is a good trade off as these are pretty cheap and in most cases it doesn't increase register pressure. This also exposed a missed opportunity to use combine to ZERO_EXTEND_VECTOR_INREG with folded loads - even if we're in the float domain. llvm-svn: 356864
* [WebAssembly] Rename a variable in CFGSort (NFC)Heejin Ahn2019-03-241-4/+4
| | | | | | | | Class `RegionInfo` was `SortUnitInfo` before, so the variables were named `SUI`. Now the class name is `RegionInfo`, so this renames `SUI` to `RI`, matching the class name. llvm-svn: 356861
* [LegalizeDAG] Expand i16 bswap directly to a rotate by 8 instead of relying ↵Craig Topper2019-03-241-3/+2
| | | | | | | | | | | | | | | | on DAG combine. An i16 bswap can be implemented with an i16 rotate by 8. We previously emitted a shift and OR sequence that DAG combine should be able to turn back into rotate. But we might as well go there directly. If rotate isn't legal, LegalizeDAG should further legalize it to either the opposite rotate, or the shift and OR pattern. I don't know of any way to get the existing DAG combine reliance to fail. So I don't know any way to add new tests for this that wouldn't have worked previously. llvm-svn: 356860
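The equivalence between the two expansions is easy to check exhaustively for 16-bit values. The following is a plain C++ sketch with hypothetical helper names (`rotl16`, `bswap16ShiftOr`, `bswap16Rotate`, `rotateMatchesShiftOr`); it illustrates the arithmetic only and is not the LegalizeDAG code.

```cpp
#include <cstdint>

// Rotate a 16-bit value left by N bits (N is reduced modulo 16).
inline uint16_t rotl16(uint16_t V, unsigned N) {
  N &= 15;
  if (N == 0)
    return V;
  return static_cast<uint16_t>((V << N) | (V >> (16 - N)));
}

// Byte swap written as the shift-and-OR sequence.
inline uint16_t bswap16ShiftOr(uint16_t V) {
  return static_cast<uint16_t>((V << 8) | (V >> 8));
}

// Byte swap written directly as a rotate by 8.
inline uint16_t bswap16Rotate(uint16_t V) { return rotl16(V, 8); }

// Checks that both forms agree on every 16-bit input.
inline bool rotateMatchesShiftOr() {
  for (uint32_t V = 0; V <= 0xFFFF; ++V) {
    uint16_t X = static_cast<uint16_t>(V);
    if (bswap16Rotate(X) != bswap16ShiftOr(X))
      return false;
  }
  return true;
}
```

Because the rotate amount is exactly half the width, rotating left or right by 8 gives the same result, which is why falling back to the opposite rotate is harmless.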
* [X86][AVX] Start shuffle combining from ZERO_EXTEND_VECTOR_INREG (PR40685)Simon Pilgrim2019-03-241-2/+14
| | | | | | | | Just enable this for AVX for now as SSE41 introduces extra register moves for the PMOVZX(PSHUFD(V)) -> UNPCKH(V,0) pattern (but otherwise helps reduce port5 usage on Intel targets). Only AVX support is required for PR40685 as the issue is due to 8i8->8i32 zext shuffle leftovers. llvm-svn: 356858
* [CGP] Make several static functions member functions (NFC)Teresa Johnson2019-03-241-19/+25
| | | | | | | This is extracted from D59696 as suggested in the review. It is preparation for making the DominatorTree a member variable. llvm-svn: 356857
* [x86] improve the default expansion of uaddsat/usubsatSanjay Patel2019-03-241-3/+31
| | | | | | | | | | | | | | | This is yet another step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613 uaddsat X, Y --> (X >u (X + Y)) ? -1 : X + Y usubsat X, Y --> (X >u Y) ? X - Y : 0 We can't count on a sane vector ISA, so override the default (umin/umax) expansion of unsigned add/sub saturate in cases where we do not have umin/umax. Differential Revision: https://reviews.llvm.org/D59006 llvm-svn: 356855
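The two expansions above can be written out for a single 8-bit lane in plain C++. This is a scalar sketch of the formulas in the message, with hypothetical function names `uaddsat8` and `usubsat8`; the commit itself operates on DAG nodes, typically for vector types.

```cpp
#include <cstdint>

// uaddsat X, Y --> (X >u (X + Y)) ? -1 : X + Y
// The wrapped sum is smaller than X exactly when the addition overflowed.
inline uint8_t uaddsat8(uint8_t X, uint8_t Y) {
  uint8_t Sum = static_cast<uint8_t>(X + Y); // wraps modulo 256
  return X > Sum ? static_cast<uint8_t>(0xFF) : Sum;
}

// usubsat X, Y --> (X >u Y) ? X - Y : 0
// The subtraction is only performed when it cannot underflow.
inline uint8_t usubsat8(uint8_t X, uint8_t Y) {
  return X > Y ? static_cast<uint8_t>(X - Y) : static_cast<uint8_t>(0);
}
```

Both forms need only an unsigned compare and a select, so they do not depend on umin/umax being available.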
* [SLPVectorizer] shouldReorderOperands - just check for reordering. NFCI.Simon Pilgrim2019-03-241-28/+24
| | | | | | Remove the I.getOperand() calls from inside shouldReorderOperands - reorderInputsAccordingToOpcode should handle the creation of the operand lists and shouldReorderOperands should just check to see whether the i'th element should be commuted. llvm-svn: 356854
* [ConstantRange] Add getFull() + getEmpty() named constructors; NFCNikita Popov2019-03-246-81/+81
| | | | | | | | | | | | | | | | This adds ConstantRange::getFull(BitWidth) and ConstantRange::getEmpty(BitWidth) named constructors as more readable alternatives to the current ConstantRange(BitWidth, /* full */ false) and similar. Additionally private getFull() and getEmpty() member functions are added which return a full/empty range with the same bit width -- these are commonly needed inside ConstantRange.cpp. The IsFullSet argument in the ConstantRange(BitWidth, IsFullSet) constructor is now mandatory for the few usages that still make use of it. Differential Revision: https://reviews.llvm.org/D59716 llvm-svn: 356852
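The readability argument for named constructors can be illustrated with a toy class. `ToyRange` below is a hypothetical stand-in and deliberately tracks only full versus empty; it is not the real ConstantRange, though the static function names mirror the ones the commit introduces.

```cpp
// Toy range that is either the full set or the empty set of a bit width.
class ToyRange {
  unsigned BitWidth;
  bool Full;

public:
  // Boolean-flag constructor: the meaning of the second argument is not
  // obvious at the call site, so it is kept mandatory rather than defaulted.
  ToyRange(unsigned BitWidth, bool IsFullSet)
      : BitWidth(BitWidth), Full(IsFullSet) {}

  // Named constructors say what they build.
  static ToyRange getFull(unsigned BitWidth) { return ToyRange(BitWidth, true); }
  static ToyRange getEmpty(unsigned BitWidth) { return ToyRange(BitWidth, false); }

  bool isFullSet() const { return Full; }
  bool isEmptySet() const { return !Full; }
  unsigned getBitWidth() const { return BitWidth; }
};
```

A call such as `ToyRange::getEmpty(32)` reads unambiguously, whereas `ToyRange(32, false)` requires knowing what the flag means.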