path: root/llvm/lib
Commit message / Author / Age / Files / Lines
...
* [X86] Make %eiz usage in 64-bit mode, force a 0x67 address size prefix. Fix ↵Craig Topper2018-06-231-0/+2
| | | | | | some test CHECK lines. llvm-svn: 335414
* [X86] Teach disassembler to use %eip instead of %rip when 0x67 prefix is ↵Craig Topper2018-06-231-1/+3
| | | | | | used on a rip-relative address. llvm-svn: 335413
* [X86][AsmParser] Improve base/index register checks.Craig Topper2018-06-231-8/+29
| - Ensure EIP isn't used with an index register.
| - Ensure EIP isn't used as an index register.
| - Ensure the base register isn't a vector register.
| - Ensure eiz/riz usage matches the size of their base register.
| llvm-svn: 335412
* Fix invariant fdiv hoisting in LICMStanislav Mekhanoshin2018-06-231-14/+14
| | | | | | | | | | | | | FDiv is replaced with multiplication by the reciprocal, and the invariant reciprocal is hoisted out of the loop, but the multiplication remains in the loop even if it is invariant. Swap the order of the checks for "all operands invariant" and "only the denominator invariant" to fix the issue. Differential Revision: https://reviews.llvm.org/D48447 llvm-svn: 335411
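The ordering issue described above can be illustrated with a small sketch. This is not the LICM code; the function names and the returned strings are hypothetical, chosen only to show why the "everything is invariant" check has to come before the "only the denominator is invariant" check, and what the reciprocal rewrite looks like.

```python
def classify_fdiv(numerator_invariant, denominator_invariant):
    """Decide how a loop-body fdiv could be treated (hypothetical helper).

    The fully invariant case is tested first; otherwise a fully invariant
    fdiv would be split and its multiply would be left inside the loop.
    """
    if numerator_invariant and denominator_invariant:
        return "hoist whole fdiv"
    if denominator_invariant:
        return "hoist reciprocal, keep multiply in loop"
    return "leave in loop"


def divide_in_loop(values, d):
    """Original form: one division per iteration."""
    return [x / d for x in values]


def divide_hoisted(values, d):
    """Rewritten form: the invariant reciprocal is computed once."""
    recip = 1.0 / d  # hoisted out of the loop
    return [x * recip for x in values]
```

Multiplying by a reciprocal can differ from a true division in the last bits of the result, so a rewrite like this is generally only valid where reciprocal approximations are permitted.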
* [AMDGPU] Update includes for intrinsic changes :(Reid Kleckner2018-06-232-4/+4
| | | | llvm-svn: 335409
* [ORC] Fix formatting and list pending queries in VSO::dump.Lang Hames2018-06-231-3/+7
| | | | llvm-svn: 335408
* [IR] Split Intrinsics.inc into enums and implementationsReid Kleckner2018-06-232-8/+9
| | | | | | | | | | | | | | | | | | | Implements PR34259 Intrinsics.h is a very popular header. Most LLVM TUs care about things like dbg_value, but they don't care how they are implemented. After I split these out, IntrinsicImpl.inc is 1.7 MB, so this saves each LLVM TU from scanning 1.7 MB of source that gets pre-processed away. It also means we can modify intrinsic properties without triggering a full rebuild, but that's probably less of a win. I think the next best thing to do would be to split out the target intrinsics into their own header. Very, very few TUs care about target-specific intrinsics. It's very hard to split up the target independent intrinsics like llvm.expect, assume, and dbg.value, though. llvm-svn: 335407
* [X86][AsmParser] Rework that allows (%dx) to be used in place of %dx with ↵Craig Topper2018-06-231-41/+29
| | | | | | | | | | in/out instructions. Previously, to support (%dx) we left a wide open hole in our 16-bit memory address checking. This let this address value be used with any instruction without error in the parser. It would later fail in the encoder with an assertion failure on debug builds and who knows what on release builds. This patch passes the mnemonic down to the memory operand parsing function so we can allow the (%dx) form only on specific instructions. llvm-svn: 335403
* [RuntimeDyld] Implement the ELF PIC large code model relocationsReid Kleckner2018-06-221-0/+43
| | | | | | | Prerequisite for https://reviews.llvm.org/D47211 which improves our ELF large PIC codegen. llvm-svn: 335402
* [LoopReroll] Rewrite induction variable rewriting.Eli Friedman2018-06-221-177/+59
| | | | | | | | | | | | | | | | | | | | This gets rid of a bunch of weird special cases; instead, just use SCEV rewriting for everything. In addition to being simpler, this fixes a bug where we would use the wrong stride in certain edge cases. The one bit I'm not quite sure about is the trip count handling, specifically the FIXME about overflow. In general, I think we need to widen the exit condition, but that's probably not profitable if the new type isn't legal, so we probably need a check somewhere. That said, I don't think I'm making the existing problem any worse. As a followup to this, a bunch of IV-related code in root-finding could be cleaned up; with SCEV-based rewriting, there isn't any reason to assume a loop will have exactly one or two PHI nodes. Differential Revision: https://reviews.llvm.org/D45191 llvm-svn: 335400
* [MSSA] Remove incorrect comment + `auto`ify dyn_cast results; NFCGeorge Burgess IV2018-06-221-6/+5
| | | | llvm-svn: 335399
* [X86][AsmParser] Keep track of whether an explicit scale was specified while ↵Craig Topper2018-06-221-8/+16
| parsing an address in Intel syntax. Use it for improved error checking.
| This allows us to check these:
| - 16-bit addressing doesn't support scale, so we should error if we find one there.
| - Multiplying ESP/RSP by a scale, even if the scale is 1, should be an error because ESP/RSP can't be an index.
| llvm-svn: 335398
* [X86][AsmParser] In Intel syntax make sure we support ESP/RSP being the ↵Craig Topper2018-06-221-0/+4
| | | | | | | | second register in memory expressions like [EAX+ESP]. By default, the second register gets assigned to the index register slot. But ESP can't be an index register, so we need to swap it with the other register. There's still a slight bug in that we allow [EAX+ESP*1]. The existence of the multiply, even though it's by 1, should force ESP into the index register slot and trigger an error, but it currently doesn't. llvm-svn: 335394
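The swap described above, together with the explicit-scale check from r335398, can be sketched roughly as follows. This is a simplified model, not the X86AsmParser code; `assign_base_index` and its parameters are hypothetical names.

```python
STACK_POINTERS = {"esp", "rsp"}


def assign_base_index(first, second, second_has_explicit_scale=False):
    """Return (base, index) for an Intel-syntax operand [first + second].

    Raises ValueError for forms that cannot be encoded (hypothetical helper).
    """
    base, index = first, second
    if index in STACK_POINTERS:
        if second_has_explicit_scale:
            # An explicit "*1" pins the register to the index slot,
            # and ESP/RSP can never be an index register.
            raise ValueError("ESP/RSP cannot be scaled")
        # No explicit scale: swap so that ESP/RSP becomes the base.
        base, index = index, base
        if index in STACK_POINTERS:
            raise ValueError("ESP/RSP cannot be an index register")
    return base, index
```

For example, `assign_base_index("eax", "esp")` yields `("esp", "eax")`, while the same call with `second_has_explicit_scale=True` models `[EAX+ESP*1]` and is rejected.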
* Re-land "[LTO] Enable module summary emission by default for regular LTO"Tobias Edler von Koch2018-06-221-1/+5
| | | | | | | | | | | | Since we are now producing a summary also for regular LTO builds, we need to run the NameAnonGlobals pass in those cases as well (the summary cannot handle anonymous globals). See https://reviews.llvm.org/D34156 for details on the original change. This reverts commit 6c9ee4a4a438a8059aacc809b2dd57128fccd6b3. llvm-svn: 335385
* [X86] Don't accept (%si,%bp) 16-bit address expressions.Craig Topper2018-06-221-4/+9
| | | | | | | | | | The second register is the index register and should only be %si or %di if used with a base register. And in that case the base register should be %bp or %bx. This makes us compatible with gas. We do still need to support both orders with Intel syntax which uses [bp+si] and [si+bp] llvm-svn: 335384
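The register rules in this message can be written down as a small check. The sketch below is an illustration only, with hypothetical function names; it models the AT&T ordering rule and the order-insensitive Intel form.

```python
VALID_BASES = {"bx", "bp"}
VALID_INDEXES = {"si", "di"}


def is_valid_att_16bit(base, index=None):
    """Check an AT&T-syntax (%base,%index) 16-bit address expression."""
    if index is None:
        # A single register may be any of the four addressable ones.
        return base in VALID_BASES | VALID_INDEXES
    return base in VALID_BASES and index in VALID_INDEXES


def normalize_intel_16bit(first, second):
    """Intel syntax accepts both [bp+si] and [si+bp].

    Returns (base, index), or None if the pair is not encodable.
    """
    for base, index in ((first, second), (second, first)):
        if base in VALID_BASES and index in VALID_INDEXES:
            return base, index
    return None
```

Under this model `(%bp,%si)` is accepted and `(%si,%bp)` is rejected, while both Intel spellings normalize to the same base/index pair.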
* [X86][AsmParser] Allow (%bp,%si) and (%bp,%di) to be encoded without using a ↵Craig Topper2018-06-221-1/+1
| | | | | | | | zero displacement. (%bp) can't be encoded without a displacement. The encoding is instead used for displacement alone. So a 1 byte displacement of 0 must be used. But if there is an index register we can encode without a displacement. llvm-svn: 335379
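To make the encoding constraint concrete, here is a minimal sketch of how the ModRM `mod` and `r/m` fields could be chosen for a 16-bit memory operand. The table follows the standard 16-bit addressing forms; the function name and return shape are assumptions made for this example, not the LLVM encoder.

```python
# r/m field values for the 16-bit addressing forms.
RM_16BIT = {
    ("bx", "si"): 0b000, ("bx", "di"): 0b001,
    ("bp", "si"): 0b010, ("bp", "di"): 0b011,
    ("si", None): 0b100, ("di", None): 0b101,
    ("bp", None): 0b110, ("bx", None): 0b111,
}


def encode_mod_rm(base, index=None, disp=0):
    """Return (mod, rm, displacement_size_in_bytes)."""
    rm = RM_16BIT[(base, index)]
    # mod=00 with r/m=110 means "16-bit displacement only", so a bare
    # (%bp) must carry an explicit one-byte zero displacement instead.
    bare_bp = base == "bp" and index is None
    if disp == 0 and not bare_bp:
        return 0b00, rm, 0
    if -128 <= disp <= 127:
        return 0b01, rm, 1
    return 0b10, rm, 2
```

With an index register present, `r/m` is no longer `110`, which is why `(%bp,%si)` and `(%bp,%di)` can use `mod=00` with no displacement byte.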
* [X86][AsmParser] Check for invalid 16-bit base register in Intel syntax.Craig Topper2018-06-221-19/+24
| | | | llvm-svn: 335373
* [X86] Don't allow ESP/RSP to be used as an index register in assembly.Craig Topper2018-06-221-1/+2
| | | | | | Fixes PR37892 llvm-svn: 335370
* [LoopUnswitch]Fix comparison for DomTree updates.Alina Sbirlea2018-06-221-2/+3
| | | | | | | | | | | | | | | | | | | Summary: In LoopUnswitch, when replacing a branch Parent -> Succ with a conditional branch Parent -> True & Parent -> False, the DomTree updates should insert an edge for each of True/False if True/False are different than Succ, and delete the Parent -> Succ edge if both are different. The comparison with Succ appears to be incorrect: it's comparing with Parent instead. There is no test failing either before or after this change, but it seems to me this is the right way to do the update. Reviewers: chandlerc, kuhar Subscribers: sanjoy, jlebar, llvm-commits Differential Revision: https://reviews.llvm.org/D48457 llvm-svn: 335369
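The update rule stated in the summary can be sketched as a small function. This is a plain-Python model of the rule, not the LoopUnswitch code; the tuple representation of updates is an assumption made for the example.

```python
def domtree_updates(parent, succ, true_dest, false_dest):
    """List the dominator-tree edge updates for replacing the branch
    parent -> succ with a conditional branch to true_dest / false_dest."""
    updates = []
    # Compare against the old successor (not against the parent).
    if true_dest != succ:
        updates.append(("insert", parent, true_dest))
    if false_dest != succ:
        updates.append(("insert", parent, false_dest))
    # The old edge disappears only if neither new edge reuses it.
    if true_dest != succ and false_dest != succ:
        updates.append(("delete", parent, succ))
    return updates
```

When one of the new destinations is still the old successor, only a single insertion is produced and the old edge is kept.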
* Initialize LiveRegs once in BranchFolder::mergeCommonTailsKrzysztof Parzyszek2018-06-221-1/+2
| | | | llvm-svn: 335365
* [SLPVectorizer] Support alternate opcodes in tryToVectorizeListSimon Pilgrim2018-06-221-28/+13
| | | | | | | | | | Enable tryToVectorizeList to support InstructionsState alternate opcode patterns at a root (build vector etc.) as well as further down the vectorization tree. NOTE: This patch reduces some of the debug reporting if there are opcode mismatches - I can try to add it back if it proves a problem. But it could get rather messy trying to provide equivalent verbose debug strings via getSameOpcode etc. Differential Revision: https://reviews.llvm.org/D48488 llvm-svn: 335364
* [SLPVectorizer] reorderAltShuffleOperands should just take ↵Simon Pilgrim2018-06-221-7/+5
| | | | | | | | InstructionsState. NFCI. All calls were extracting the InstructionsState Opcode/AltOpcode values so we might as well pass it directly llvm-svn: 335359
* [DWARFv5] Allow ".loc 0" to refer to the root file.Paul Robinson2018-06-222-4/+6
| | | | | | | | | DWARF v5 explicitly represents file #0 in the line table. Prior versions did not, so ".loc 0" is still an error in those cases. Differential Revision: https://reviews.llvm.org/D48452 llvm-svn: 335350
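The version-dependent rule above amounts to a one-line check. A minimal sketch, with a hypothetical function name:

```python
def is_valid_loc_file_number(file_number, dwarf_version):
    """Return True if `.loc <file_number>` is acceptable for this DWARF version."""
    if file_number < 0:
        return False
    if file_number == 0:
        # File #0 is only represented in the line table from DWARF v5 on.
        return dwarf_version >= 5
    return True
```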
* [SLPVectorizer] Relax alternate opcodes to accept any BinaryOperator pairSimon Pilgrim2018-06-221-27/+11
| | | | | | | | | | SLP currently only accepts (F)Add/(F)Sub alternate counterpart ops to be merged into an alternate shuffle. This patch relaxes this to accept any pair of BinaryOperator opcodes instead, assuming the target's cost model accepts the vectorization+shuffle. Differential Revision: https://reviews.llvm.org/D48477 llvm-svn: 335349
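As an illustration of what an alternate shuffle computes, the sketch below applies two binary operations to whole vectors and then picks lanes from each result. The fixed even/odd lane pattern and the function name are simplifications chosen for this example; the cost-model check mentioned above is not modeled.

```python
import operator


def alternate_shuffle(a, b, op_even, op_odd):
    """Emulate lanes that alternate between two binary operations."""
    main = [op_even(x, y) for x, y in zip(a, b)]  # first op on all lanes
    alt = [op_odd(x, y) for x, y in zip(a, b)]    # second op on all lanes
    # Shuffle: even lanes come from `main`, odd lanes from `alt`.
    return [main[i] if i % 2 == 0 else alt[i] for i in range(len(main))]


# Usage: a mul/add pair rather than the classic add/sub pair.
lanes = alternate_shuffle([1, 2, 3, 4], [10, 20, 30, 40],
                          operator.mul, operator.add)
```

Both operations are evaluated on every lane, which is why the target's cost model has to decide whether the two vector ops plus the shuffle beat the scalar code.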
* [InstCombine] rearrange shuffle-of-binops logic; NFCSanjay Patel2018-06-221-17/+12
| | | | | | | | The commutative matcher makes things more complicated here, and I'm planning an enhancement where this form is more readable. llvm-svn: 335343
* Recommit r335333 "[MC] - Add .stack_size sections into groups and link them ↵George Rimar2018-06-222-1/+23
| with .text"
| With compilation fix.
| Original commit message:
| D39788 added a '.stack-size' section containing metadata on function stack sizes to output ELF files behind the new -stack-size-section flag.
| This change does the following two things on top:
| 1) Imagine the case where the -ffunction-sections flag is given and there are text sections in COMDATs. The patch adds a '.stack-size' section to the corresponding COMDAT group, so that the linker can eliminate them quickly while resolving the COMDATs.
| 2) The patch sets the SHF_LINK_ORDER flag and links '.stack-size' with the corresponding .text. With that, the linker can apply -gc-sections to dead stack-size sections.
| Differential revision: https://reviews.llvm.org/D46874
| llvm-svn: 335336
* [IR] Use Instruction::isBinaryOp helper instead of raw enum range tests. NFCI.Simon Pilgrim2018-06-222-6/+3
| | | | llvm-svn: 335335
* Revert r335332 "[MC] - Add .stack_size sections into groups and link them ↵George Rimar2018-06-222-23/+1
| with .text"
| It broke bots.
| http://lab.llvm.org:8011/builders/clang-ppc64le-linux-lnt/builds/12891
| http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/9443
| http://lab.llvm.org:8011/builders/lldb-x86_64-ubuntu-14.04-buildserver/builds/25551
| llvm-svn: 335333
* [MC] - Add .stack_size sections into groups and link them with .textGeorge Rimar2018-06-222-1/+23
| D39788 added a '.stack-size' section containing metadata on function stack sizes to output ELF files behind the new -stack-size-section flag.
| This change does the following two things on top:
| 1) Imagine the case where the -ffunction-sections flag is given and there are text sections in COMDATs. The patch adds a '.stack-size' section to the corresponding COMDAT group, so that the linker can eliminate them quickly while resolving the COMDATs.
| 2) The patch sets the SHF_LINK_ORDER flag and links '.stack-size' with the corresponding .text. With that, the linker can apply -gc-sections to dead stack-size sections.
| Differential revision: https://reviews.llvm.org/D46874
| llvm-svn: 335332
* Recommit of r335326, with the test fixed that I missed.Sjoerd Meijer2018-06-221-3/+6
| | | | llvm-svn: 335331
* [CostModel][AArch64] Add some initial costs for SK_Select and ↵Simon Pilgrim2018-06-221-17/+32
| | | | | | | | | | | | SK_PermuteSingleSrc AArch64 was only setting costs for SK_Transpose, which meant that many of the simpler shuffles (e.g. SK_Select and SK_PermuteSingleSrc for larger vector elements) were being severely overestimated by the default shuffle expansion. This patch adds costs to help improve SLP performance and avoid a regression in reductions introduced by D48174. I'm not very knowledgeable about AArch64 shuffle lowering so I've kept the extra costs to a minimum - someone who knows this code can add extra costs which should improve vectorization a lot more. Differential Revision: https://reviews.llvm.org/D48172 llvm-svn: 335329
* Reverting r335326 while I look at the test failureSjoerd Meijer2018-06-221-6/+3
| | | | llvm-svn: 335328
* Revert r335324 due to a buildbot failureEugene Leviant2018-06-221-30/+3
| | | | llvm-svn: 335327
* [ARM] ARMv6m and v8m.baseline strict alignSjoerd Meijer2018-06-221-3/+6
| | | | | | | | | | This sets target feature FeatureStrictAlign for Armv6-m and Armv8-m.baseline, because they have no support for unaligned accesses. It looks like we always pass target feature "+strict-align" from Clang, so this is not a user facing problem, but querying the subtarget (in e.g. llc) for unaligned access support is incorrect. Differential Revision: https://reviews.llvm.org/D48437 llvm-svn: 335326
* AMDGPU: Add patterns for i32/i64 local atomic load/storeMatt Arsenault2018-06-224-1/+54
| | | | | | | | Not sure why the 32/64 split is needed in the atomic_load store hierarchies. The regular PatFrags do this, but we don't do it for the existing handling for global. llvm-svn: 335325
* [Evaluator] Improve evaluation of call instructionEugene Leviant2018-06-221-3/+30
| | | | | | Differential revision: https://reviews.llvm.org/D46584 llvm-svn: 335324
* [X86] Changing the check for valid inputs in combineScalarToVectorMikhail Dvoretckii2018-06-221-5/+6
| | | | | | | | | Changing the logic of scalar mask folding to check for valid input types rather than against invalid ones, making it more robust and fixing PR37879. Differential Revision: https://reviews.llvm.org/D48366 llvm-svn: 335323
* Revert r335306 (and r335314) - the Call Graph Profile pass.Chandler Carruth2018-06-226-181/+7
| | | | | | | | | | | This is the first pass in the main pipeline to use the legacy PM's ability to run function analyses "on demand". Unfortunately, it turns out there are bugs in that somewhat-hacky approach. At the very least, it leaks memory and doesn't support -debug-pass=Structure. Unclear if there are larger issues or not, but this should get the sanitizer bots back to green by fixing the memory leaks. llvm-svn: 335320
* AMDGPU/GlobalISel: Default to using TableGen'd instruction selectorTom Stellard2018-06-221-7/+0
| | | | | | | | | | | | | | | | Summary: We can select all instructions that are marked as legal in a full piglit run, so now is a good time to make the TableGen'd instruction selector default for all opcodes. This is NFC for a full piglit run, which is why there are no tests. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D48198 llvm-svn: 335319
* AMDGPU/GlobalISel: legalize and select 32-bit G_ASHRTom Stellard2018-06-224-0/+47
| | | | | | | | | | Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D48196 llvm-svn: 335318
* [LegacyPM] Fix PR37888 by teaching the legacy loop pass manager how toChandler Carruth2018-06-221-1/+10
| | | | | | | | | | | | | | | | | clear out deleted loops from the current queue beyond just the current loop. This is important because SimpleLoopUnswitch will now enqueue the same loop to be re-processed. When it does this with the legacy PM, we don't have a way of canceling the rest of the pipeline and so we can end up deleting the loop before we reprocess it. =/ This change also makes it easy to support deleting other loops in the queue to process, although I don't have any use cases for that. Differential Revision: https://reviews.llvm.org/D48470 llvm-svn: 335317
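The queue handling described above can be modeled in a few lines. This is a toy model with hypothetical class and method names, not the legacy pass manager: it only shows that deleting a loop must remove every pending occurrence, including a copy that was re-enqueued for reprocessing.

```python
from collections import deque


class LoopQueue:
    """Toy work queue of loops awaiting processing."""

    def __init__(self, loops):
        self.queue = deque(loops)

    def pop_next(self):
        return self.queue.popleft()

    def requeue(self, loop):
        # A pass may ask for the same loop to be processed again.
        self.queue.append(loop)

    def mark_deleted(self, loop):
        # Drop every pending occurrence, not just the current loop.
        self.queue = deque(l for l in self.queue if l != loop)
```

Without the filtering in `mark_deleted`, a re-enqueued loop that is later deleted would still be popped and processed.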
* AMDGPU/GlobalISel: legalize and select 32-bit G_SITOFPTom Stellard2018-06-224-0/+18
| | | | | | | | | | | | Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D48195 llvm-svn: 335316
* AMDGPU/GlobalISel: Implement select() for COPYTom Stellard2018-06-221-1/+4
| | | | | | | | | | | | Reviewers: arsenm, nhaehnle Reviewed By: nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D46151 llvm-svn: 335315
* [InstCombine] fix shuffle-of-binops bugSanjay Patel2018-06-211-2/+8
| | | | | | | | | With non-commutative binops, we could be using the same variable value as operand 0 in 1 binop and operand 1 in the other, so we have to check for that possibility and bail out. llvm-svn: 335312
* AMDGPU/GlobalISel: Implement select() for G_IMPLICIT_DEFTom Stellard2018-06-212-0/+16
| | | | | | | | | | Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D46150 llvm-svn: 335307
* [Instrumentation] Add Call Graph Profile passMichael J. Spencer2018-06-216-7/+181
| This patch adds support for generating a call graph profile from Branch Frequency Info.
|
| The CGProfile module pass simply gets the block profile count for each BB and scans for call instructions. For each call instruction it adds an edge from the current function to the called function with the current BB block profile count as the weight.
|
| After scanning all the functions, it generates an appending module flag containing the data. The format looks like:
|
|     !llvm.module.flags = !{!0}
|
|     !0 = !{i32 5, !"CG Profile", !1}
|     !1 = !{!2, !3, !4} ; List of edges
|     !2 = !{void ()* @a, void ()* @b, i64 32} ; Edge from a to b with a weight of 32
|     !3 = !{void (i1)* @freq, void ()* @a, i64 11}
|     !4 = !{void (i1)* @freq, void ()* @b, i64 20}
|
| Differential Revision: https://reviews.llvm.org/D48105
| llvm-svn: 335306
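The edge-collection step described in this message can be sketched as follows. The input shape (a dict of functions, each a list of `(block_count, callees)` pairs) is invented for the example, and summing the weights of repeated caller/callee edges is an assumption of this sketch rather than something the message states.

```python
from collections import defaultdict


def build_call_graph_profile(functions):
    """Collect weighted call edges.

    functions: {caller_name: [(block_profile_count, [callee_name, ...]), ...]}
    Returns {(caller, callee): weight}.
    """
    edges = defaultdict(int)
    for caller, blocks in functions.items():
        for block_count, callees in blocks:
            for callee in callees:
                # Each call contributes its block's profile count.
                edges[(caller, callee)] += block_count
    return dict(edges)


# Usage: the same three edges as the metadata example in the message.
profile = build_call_graph_profile({
    "a": [(32, ["b"])],
    "freq": [(11, ["a"]), (20, ["b"])],
})
```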
* [X86] Fix 32-bit mingw comdat names, only add one underscoreReid Kleckner2018-06-211-11/+6
| | | | llvm-svn: 335304
* Revert r335297 "[X86] Implement more of x86-64 large and medium PIC code models"Reid Kleckner2018-06-216-115/+29
| | | | | | MCJIT can't handle R_X86_64_GOT64 yet. llvm-svn: 335300
* [X86] Commit some comments that weren't in the medium code model patchReid Kleckner2018-06-211-4/+4
| | | | llvm-svn: 335298
* [X86] Implement more of x86-64 large and medium PIC code modelsReid Kleckner2018-06-216-27/+113
| Summary:
| The large code model allows code and data segments to exceed 2GB, which means that some symbol references may require a displacement that cannot be encoded as a displacement from RIP. The large PIC model even relaxes the assumption that the GOT itself is within 2GB of all code. Therefore, we need a special code sequence to materialize it:
|
|     .LtmpN:
|       leaq .LtmpN(%rip), %rbx
|       movabsq $_GLOBAL_OFFSET_TABLE_-.LtmpN, %rax # Scratch
|       addq %rax, %rbx # GOT base reg
|
| From that, non-local references go through the GOT base register instead of being PC-relative loads. Local references typically use GOTOFF symbols, like this:
|
|     movq extern_gv@GOT(%rbx), %rax
|     movq local_gv@GOTOFF(%rbx), %rax
|
| All calls end up being indirect:
|
|     movabsq $local_fn@GOTOFF, %rax
|     addq %rbx, %rax
|     callq *%rax
|
| The medium code model retains the assumption that the code segment is less than 2GB, so calls are once again direct, and the RIP-relative loads can be used to access the GOT. Materializing the GOT is easy:
|
|     leaq _GLOBAL_OFFSET_TABLE_(%rip), %rbx # GOT base reg
|
| DSO local data accesses will use it:
|
|     movq local_gv@GOTOFF(%rbx), %rax
|
| Non-local data accesses will use RIP-relative addressing, which means we may not always need to materialize the GOT base:
|
|     movq extern_gv@GOTPCREL(%rip), %rax
|
| Direct calls are basically the same as they are in the small code model: They use direct, PC-relative addressing, and the PLT is used for calls to non-local functions.
|
| This patch adds reasonably comprehensive testing of LEA, but there are lots of interesting folding opportunities that are unimplemented.
|
| Reviewers: chandlerc, echristo
| Subscribers: hiraditya, llvm-commits
| Differential Revision: https://reviews.llvm.org/D47211
| llvm-svn: 335297
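The arithmetic behind the large-model GOT materialization sequence can be checked with a short sketch. The addresses are made up, the function names are hypothetical, and the displacement check is simplified: a real RIP-relative displacement is measured from the end of the instruction, which this model ignores.

```python
MASK64 = (1 << 64) - 1


def materialize_got_base(label_addr, got_addr):
    """Model the three-instruction sequence with 64-bit wraparound."""
    rbx = label_addr                        # leaq .LtmpN(%rip), %rbx
    rax = (got_addr - label_addr) & MASK64  # movabsq $_GLOBAL_OFFSET_TABLE_-.LtmpN, %rax
    rbx = (rbx + rax) & MASK64              # addq %rax, %rbx
    return rbx


def fits_in_rip_displacement(from_addr, to_addr):
    """True if the distance fits a signed 32-bit displacement."""
    return -(1 << 31) <= to_addr - from_addr < (1 << 31)
```

The label address cancels out, so `%rbx` ends up holding the GOT address no matter how far away the GOT is, which is the point of using a 64-bit `movabsq` immediate when a single RIP-relative `leaq` may not reach.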