summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [PDB] Add support for dumping Typedef records.Zachary Turner2018-10-015-0/+123
| | | | | | | | | | These work a little differently because they are actually in the globals stream and are treated as symbol records, even though DIA presents them as types. So this also adds the necessary infrastructure to cache records that live somewhere other than the TPI stream as well. llvm-svn: 343507
* [PDB] Add support for parsing VFTable Shape records.Zachary Turner2018-10-015-2/+53
| | | | | | This allows them to be returned from the native API. llvm-svn: 343506
* MIRParser: Check that instructions only reference DILocation metadataMatthias Braun2018-10-011-0/+2
| | | | llvm-svn: 343505
* [WebAssembly] Fixed AsmParser not allowing instructions with /Wouter van Oortmerssen2018-10-011-9/+28
| | | | | | | | | | | | | | | | | | | Summary: The AsmParser Lexer regards these as a seperate token. Here we expand the instruction name with them if they are adjacent (no whitespace). Tested: the basic-assembly.s test case has one case with a / in it. The currently are also instructions with : in them, which we intend to rename rather than fix them here. Reviewers: tlively, dschuff Subscribers: sbc100, jgravelle-google, aheejin, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D52442 llvm-svn: 343501
* [X86] Enable load folding in the test shrinking codeCraig Topper2018-10-011-9/+25
| | | | | | | | This patch adds load folding support to the test shrinking code. This was noticed missing in the review for D52669 Differential Revision: https://reviews.llvm.org/D52699 llvm-svn: 343499
* [X86] Improve test instruction shrinking when the sign flag is used and the ↵Craig Topper2018-10-011-5/+18
| | | | | | | | | | | | | | output of the and is truncated Currently we skip looking through truncates if the sign flag is used. But that's overly restrictive. It's safe to look through the truncate as long as we ensure one of the 3 things when we shrink. Either the MSB of the mask at the shrunken size isn't set. If the mask bit is set then either the shrunk size needs to be equal to the compare size or the sign flag needs to be unused. There are still missed opportunities to shrink a load and fold it in here. This will be fixed in a future patch. Differential Revision: https://reviews.llvm.org/D52669 llvm-svn: 343498
* [X86][Btver2] Fix BT(C|R|S)mr & BT(C|R|S)mi schedule latency + uop countsSimon Pilgrim2018-10-011-2/+2
| | | | | | Match AMD Fam16h SOG + llvm-exegesis tests llvm-svn: 343494
* DAGCombiner: StoreMerging: Fix bad index calculating when adjusting ↵Matthias Braun2018-10-011-17/+8
| | | | | | | | | | | | | | | | mismatching vector types This fixes a case of bad index calculation when merging mismatching vector types. This changes the existing code to just use the existing extract_{subvector|element} and a bitcast (instead of bitcast first and then newly created extract_xxx) so we don't need to adjust any indices in the first place. rdar://44584718 Differential Revision: https://reviews.llvm.org/D52681 llvm-svn: 343493
* [X86] Create schedule classes for BT(C|R|S)mi and BT(C|R|S)mr instructionsSimon Pilgrim2018-10-0111-49/+71
| | | | llvm-svn: 343490
* [AArch64] Refactor cheap cost modelEvandro Menezes2018-10-011-12/+23
| | | | | | | Refactor the order in `TII::isAsCheapAsAMove()` to ease future development and maintenance. Practically NFC. llvm-svn: 343489
* [X86] Remove unnecessary BTmi/BTmr scheduler overridesSimon Pilgrim2018-10-017-56/+9
| | | | llvm-svn: 343487
* [InstCombine] Handle vector compares in foldGEPIcmp(), take 2Jesper Antonsson2018-10-011-1/+2
| | | | | | | | | | | | | | | | | | | Summary: This is a continuation of the fix for PR34627 "InstCombine assertion at vector gep/icmp folding". (I just realized bugpoint had fuzzed the original test for me, so I had fixed another trigger of the same assert in adjacent code in InstCombine.) This patch avoids optimizing an icmp (to look only at the base pointers) when the resulting icmp would have a different type. The patch adds a testcase and also cleans up and shrinks the pre-existing test for the adjacent assert trigger. Reviewers: lebedev.ri, majnemer, spatel Reviewed By: lebedev.ri Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52494 llvm-svn: 343486
* [X86][Btver2] Fix BTmr schedule uop countsSimon Pilgrim2018-10-011-1/+1
| | | | | | Match AMD Fam16h SOG + llvm-exegesis tests llvm-svn: 343484
* [InstCombine] try to convert vector insert+extract to trunc; 2nd trySanjay Patel2018-10-011-2/+46
| | | | | | | | | | | | | | | | | | | | | | | This was originally committed at rL343407, but reverted at rL343458 because it crashed trying to handle a case where the destination type is FP. This version of the patch adds a check for that possibility. Tests added at rL343480. Original commit message: This transform is requested for the backend in: https://bugs.llvm.org/show_bug.cgi?id=39016 ...but I figured it was worth doing in IR too, and it's probably easier to implement here, so that's this patch. In the simplest case, we are just truncating a scalar value. If the extract index doesn't correspond to the LSBs of the scalar, then we have to shift-right before the truncate. Endian-ness makes this tricky, but hopefully the ASCII-art helps visualize the transform. Differential Revision: https://reviews.llvm.org/D52439 llvm-svn: 343482
* [X86] Create schedule classes for BTmi and BTmr instructionsSimon Pilgrim2018-10-0111-27/+53
| | | | llvm-svn: 343478
* [LLVM-C] Add an accessor for the kind of a Metadata NodeRobert Widmann2018-10-011-0/+11
| | | | | | | | | | | | | | Summary: Allows for retrieving the type of a metadata node. Has the added benefit of ensuring that the C and C++ kind APIs stay in sync as a failure to add a corresponding LLVMMetadataKind will result in the switch in the accessor being semantically malformed. Reviewers: whitequark, deadalnix Reviewed By: whitequark Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52693 llvm-svn: 343469
* [X86][Btver2] Fix masked load scheduleSimon Pilgrim2018-10-011-4/+4
| | | | | | | | JFPU01 resource usage should match JFPX Match AMD Fam16h SOG + llvm-exegesis tests llvm-svn: 343468
* Revert r343407 "[InstCombine] try to convert vector insert+extract to trunc"Hans Wennborg2018-10-011-44/+2
| | | | | | | | | | | | | | | | | | | This caused Chromium builds to fail with "Illegal Trunc" assertion. See https://crbug.com/890723 for repro. > This transform is requested for the backend in: > https://bugs.llvm.org/show_bug.cgi?id=39016 > ...but I figured it was worth doing in IR too, and it's probably > easier to implement here, so that's this patch. > > In the simplest case, we are just truncating a scalar value. If the > extract index doesn't correspond to the LSBs of the scalar, then we > have to shift-right before the truncate. Endian-ness makes this tricky, > but hopefully the ASCII-art helps visualize the transform. > > Differential Revision: https://reviews.llvm.org/D52439 llvm-svn: 343458
* [AMDGPU] Divergence driven instruction selection. Shift operations.Alexander Timofeev2018-10-013-23/+45
| | | | | | | | | | Summary: This change enables VOP3 shifts to be explicitly selected dependent on the divergence. Differential Revision: https://reviews.llvm.org/D52559 Reviewers: rampitec llvm-svn: 343455
* [X86][BtVer2] Teach how to identify zero-idiom VPERM2F128rr instructions.Andrea Di Biagio2018-10-012-1/+16
| | | | | | | | | | | This patch adds another variant class to identify zero-idiom VPERM2F128rr instructions. On Jaguar, a VPERM wih bit 3 and 7 of the mask set, is a zero-idiom. Differential Revision: https://reviews.llvm.org/D52663 llvm-svn: 343452
* Recommit r343308: [LoopInterchange] Turn into a loop pass.Florian Hahn2018-10-012-46/+16
| | | | llvm-svn: 343450
* [X86][Sched] Update scheduling information for VZEROALL on HWS, BDW, SKX, SNB.Clement Courbet2018-10-014-12/+19
| | | | | | | | | | | | | | | | | Summary: While looking at PR35606, I found out that the scheduling info is incorrect. One can check that it's really a P5+P6 and not a 2*P56 with: echo -e 'vzeroall\nvandps %xmm1, %xmm2, %xmm3' | ./bin/llvm-exegesis -mode=uops -snippets-file=- (vandps executes on P5 only) Reviewers: craig.topper, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52541 llvm-svn: 343447
* [X86][Sched] Add pfm uop counter definitions for SNB,BDW,SKX.Clement Courbet2018-10-011-0/+3
| | | | llvm-svn: 343446
* [DebugInfo][Dexter] Incorrect DBG_VALUE after MCP dead copy instruction removal.Carlos Alberto Enciso2018-10-013-6/+17
| | | | | | | | When MachineCopyPropagation eliminates a dead 'copy', its associated debug information becomes invalid. as the recorded register has been removed. It causes the debugger to display wrong variable value. Differential Revision: https://reviews.llvm.org/D52614 llvm-svn: 343445
* [X86] Stop X86DomainReassignment from creating copies between GR8/GR16 ↵Craig Topper2018-10-011-0/+21
| | | | | | | | | | | | | | physical registers and k-registers. We can only copy between a k-register and a GR32/GR64 register. This patch detects that the copy will be illegal and prevents the domain reassignment from happening for that closure. This probably isn't the best fix, and we should probably figure out how to handle this correctly. Fixes PR38803. llvm-svn: 343443
* [ORC] Pass Symbols to ExecutionSession::lookup by value, potentially saving aLang Hames2018-10-011-2/+2
| | | | | | copy. llvm-svn: 343442
* [ORC] Add convenience methods for creating DynamicLibraryFallbackGenerators forLang Hames2018-10-011-0/+9
| | | | | | | | libraries on disk, and for the current process. Avoids more boilerplate during JIT construction. llvm-svn: 343430
* [X86] Change an llvm_unreachable to a report_fatal_error so the optimizer ↵Craig Topper2018-09-301-1/+1
| | | | | | | | will stop making us reach the other report_fatal_error in this function. There's a conditional report_fatal_error just above this llvm_unreachable. The optimizer when seeing the unreachable removes the conditional and just makes any other error trigger the existing report_fatal_error. llvm-svn: 343428
* [ORC] Add an 'intern' method to ExecutionEngine for interning symbol names.Lang Hames2018-09-306-11/+10
| | | | | | | This cuts down on boilerplate by reducing 'ES.getSymbolStringPool().intern(...)' to 'ES.intern(...)'. llvm-svn: 343427
* Use the container form llvm::sort(C, ...)Fangrui Song2018-09-305-24/+19
| | | | | | | There are a few leftovers in rL343163 which span two lines. This commit changes these llvm::sort(C.begin(), C.end, ...) to llvm::sort(C, ...) llvm-svn: 343426
* [X86] Fix scheduler class for BTmi instructionsSimon Pilgrim2018-09-301-1/+1
| | | | | | This wasn't treated as a folded load instruction llvm-svn: 343424
* [ORC] Extract and tidy up JITTargetMachineBuilder, add unit test.Lang Hames2018-09-303-39/+56
| | | | | | | | | | | (1) Adds comments for the API. (2) Removes the setArch method: This is redundant: the setArchStr method on the triple should be used instead. (3) Turns EmulatedTLS on by default. This matches EngineBuilder's behavior. llvm-svn: 343423
* [X86] Copy memrefs when folding a load for division instruction selection.Craig Topper2018-09-301-8/+10
| | | | llvm-svn: 343419
* [PHIElimination] Lower a PHI node with only undef uses as IMPLICIT_DEFBjorn Pettersson2018-09-301-13/+13
| | | | | | | | | | | | | | | | | | | | | Summary: The lowering of PHI nodes used to detect if all inputs originated from IMPLICIT_DEF's. If so the PHI node was replaced by an IMPLICIT_DEF. Now we also consider undef uses when checking the inputs. So if all inputs are implicitly defined or undef we lower the PHI to an IMPLICIT_DEF. This makes PHIElimination::LowerPHINode more consistent as it checks both implicit and undef properties at later stages. Reviewers: MatzeB, tstellar Reviewed By: MatzeB Subscribers: jvesely, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D52558 llvm-svn: 343417
* [PHIElimination] Update the regression test for PR16508Bjorn Pettersson2018-09-301-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When PR16508 was solved (in rL185363) a regression test was added as test/CodeGen/PowerPC/2013-07-01-PHIElimBug.ll. I discovered that the test case no longer reproduced the scenario from PR16508. This problem could have been amended by adding an extra RUN line with "-O1" (or possibly "-O0"), but instead I added a mir-reproducer test/CodeGen/PowerPC/2013-07-01-PHIElimBug.mir to get a reproducer that is less sensitive to changes in earlier passes (including O-level). While being at it I also corrected a code comment in PHIElimination::EliminatePHINodes that has been incorrect since the related bugfix from rL185363. Reviewers: MatzeB, hfinkel Reviewed By: MatzeB Subscribers: nemanjai, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D52553 llvm-svn: 343416
* [X86][Btver2] Fix PCmpIStrI/PCmpIStrM schedulesSimon Pilgrim2018-09-301-2/+2
| | | | | | | | Missing JFPU0 pipe and double JFPU1 pipe (to match JVALU1) resources Match AMD Fam16h SOG + llvm-exegesis tests llvm-svn: 343413
* [PDB] Add native support for dumping array types.Zachary Turner2018-09-304-0/+75
| | | | llvm-svn: 343412
* [X86][BtVer2] Add the ability to add additional uops for folded instructionsSimon Pilgrim2018-09-301-6/+9
| | | | | | Some instructions take an extra load uop - but not consistently..... llvm-svn: 343410
* [InstCombine] try to convert vector insert+extract to truncSanjay Patel2018-09-301-2/+44
| | | | | | | | | | | | | | | | This transform is requested for the backend in: https://bugs.llvm.org/show_bug.cgi?id=39016 ...but I figured it was worth doing in IR too, and it's probably easier to implement here, so that's this patch. In the simplest case, we are just truncating a scalar value. If the extract index doesn't correspond to the LSBs of the scalar, then we have to shift-right before the truncate. Endian-ness makes this tricky, but hopefully the ASCII-art helps visualize the transform. Differential Revision: https://reviews.llvm.org/D52439 llvm-svn: 343407
* [InstCombine] allow lengthening of insertelement to eliminate shufflesSanjay Patel2018-09-301-2/+8
| | | | | | | | | | | | | As noted in post-commit comments for D52548, the limitation on increasing vector length can be applied by opcode. As a first step, this patch only allows insertelement to be widened because that has no logical downsides for IR and has little risk of pessimizing codegen. This may cause PR39132 to go into hiding during a full compile, but that bug is not fixed. llvm-svn: 343406
* [DAG] Don't perform SINT_TO_FP<->UINT_TO_FP custom conversion after legalizationSimon Pilgrim2018-09-301-4/+4
| | | | | | | | The SINT_TO_FP<->UINT_TO_FP combines for non-negative integers should only occur for legal ops once LegalOperations = true No test case to hand, noticed when investigating PR38226 + PR38970 llvm-svn: 343405
* [X86] Disable BMI BEXTR in X86DAGToDAGISel::matchBEXTRFromAnd unless we're ↵Craig Topper2018-09-303-1/+21
| | | | | | | | | | | | | | | | | | | | | on compiling for a CPU with single uop BEXTR Summary: This function turns (X >> C1) & C2 into a BMI BEXTR or TBM BEXTRI instruction. For BMI BEXTR we have to materialize an immediate into a register to feed to the BEXTR instruction. The BMI BEXTR instruction is 2 uops on Intel CPUs. It looks like on SKL its one port 0/6 uop and one port 1/5 uop. Despite what Agner's tables say. I know one of the uops is a regular shift uop so it would have to go through the port 0/6 shifter unit. So that's the same or worse execution wise than the shift+and which is one 0/6 uop and one 0/1/5/6 uop. The move immediate into register is an additional 0/1/5/6 uop. For now I've limited this transform to AMD CPUs which have a single uop BEXTR. If may also might make sense if we can fold a load or if the and immediate is larger than 32-bits and can't be encoded as a sign extended 32-bit value or if LICM or CSE can hoist the move immediate and share it. But we'd need to look more carefully at that. In the regression I looked at it doesn't look load folding or large immediates were occurring so the regression isn't caused by the loss of those. So we could try to be smarter here if we find a compelling case. Reviewers: RKSimon, spatel, lebedev.ri, andreadb Reviewed By: RKSimon Subscribers: llvm-commits, andreadb, RKSimon Differential Revision: https://reviews.llvm.org/D52570 llvm-svn: 343399
* [ORC] Add partitioning support to CompileOnDemandLayer2.Lang Hames2018-09-294-160/+186
| | | | | | | | | | | | | CompileOnDemandLayer2 now supports user-supplied partition functions (the original CompileOnDemandLayer already supported these). Partition functions are called with the list of requested global values (i.e. global values that currently have queries waiting on them) and have an opportunity to select extra global values to materialize at the same time. Also adds testing infrastructure for the new feature to lli. llvm-svn: 343396
* [ORC] Clear SymbolToDefinitionMap when materializing a MaterializationUnit.Lang Hames2018-09-291-0/+5
| | | | | | | The map is inaccessible at this point, so we may as well reclaim the memory early. llvm-svn: 343395
* [PDB] Better native API support for pointers.Zachary Turner2018-09-291-0/+59
| | | | | | | | | | | | We didn't properly detect when a pointer was a member pointer, and when that was the case we were not properly returning class parent info. This caused member pointers to render incorrectly in pretty mode. However, we didn't even have pretty tests for pointers in native mode, so those are also added now to ensure this. llvm-svn: 343393
* [X86] SimplifyDemandedVectorEltsForTargetNode - remove identity target ↵Simon Pilgrim2018-09-291-18/+18
| | | | | | | | shuffles before simplifying inputs By removing demanded target shuffles that simplify to zero/undef/identity before simplifying its inputs we improve chances of further simplification, as only the immediate parent user of the combined is added back to the work list - this still doesn't help us if its passed through other ops though (bitcasts....). llvm-svn: 343390
* [X86][SSE] LowerScalarImmediateShift - remove 32-bit vXi64 special case ↵Simon Pilgrim2018-09-291-129/+65
| | | | | | | | handling. This is all handled generally by getTargetConstantBitsFromNode now llvm-svn: 343387
* Fix signed/unsigned mismatch warning. NFCI.Simon Pilgrim2018-09-291-1/+1
| | | | llvm-svn: 343385
* [X86] getTargetConstantBitsFromNode - add support for rearranging constant ↵Simon Pilgrim2018-09-291-0/+47
| | | | | | | | bits via shuffles Exposed an issue that recursive calls to getTargetConstantBitsFromNode don't handle changes to EltSizeInBits yet. llvm-svn: 343384
* [X86][SSE] LowerScalarImmediateShift - use getTargetConstantBitsFromNode to ↵Simon Pilgrim2018-09-291-64/+74
| | | | | | | | | | get immediate data Don't just attempt to find a splat build vector. First step towards getting rid of all the 32-bit special case code. llvm-svn: 343383
OpenPOWER on IntegriCloud