path: root/llvm/test
Commit message | Author | Age | Files | Lines
* [LLParser] Parse vector GEP constant expression correctly | Michael Kuperstein | 2016-12-21 | 2 | -0/+17
  The constantexpr parsing was too constrained and rejected legal vector GEPs. This relaxes it to be similar to the ones for instruction parsing. This fixes PR30816. Differential Revision: https://reviews.llvm.org/D28013 llvm-svn: 290261
* [ConstantFolding] Fix vector GEPs harder | Michael Kuperstein | 2016-12-21 | 1 | -0/+21
  For vector GEPs, CastGEPIndices can end up in an infinite recursion, because we compare the vector type to the scalar pointer type, find them different, and then try to cast a type to itself. Differential Revision: https://reviews.llvm.org/D28009 llvm-svn: 290260
* Added a template for building target specific memory node in DAG. | Elena Demikhovsky | 2016-12-21 | 4 | -44/+42
  I added an API for creating a target-specific memory node in the DAG. Today, all memory nodes are common to all targets and their constructors are located in SelectionDAG.cpp. There are some cases in X86 where we need to create a special node: truncation-with-saturation store, float-to-half store. In the current patch I added truncation-with-saturation nodes and I'm using them for intrinsics. In the future I plan to implement DAG lowering for the truncation-with-saturation pattern. Differential Revision: https://reviews.llvm.org/D27899 llvm-svn: 290250
* [X86] Vectorcall Calling Convention - Adding CodeGen Complete Support | Oren Ben Simhon | 2016-12-21 | 1 | -1/+1
  Fixing failing test. llvm-svn: 290246
* [X86] Vectorcall Calling Convention - Adding CodeGen Complete Support | Oren Ben Simhon | 2016-12-21 | 1 | -6/+136
  The vectorcall calling convention specifies that arguments to functions are to be passed in registers, when possible. vectorcall uses more registers for arguments than fastcall or the default x64 calling convention use. The vectorcall calling convention is only supported in native code on x86 and x64 processors that include Streaming SIMD Extensions 2 (SSE2) and above. The current implementation does not handle Homogeneous Vector Aggregates (HVAs) correctly and this review attempts to fix it. This submit also includes additional lit tests to better cover HVA corner cases. Differential Revision: https://reviews.llvm.org/D27392 llvm-svn: 290240
* [LDist] Match behavior between invoking via optimization pipeline or opt -loop-distribute | Adam Nemet | 2016-12-21 | 15 | -25/+25
  In r267672, where the loop distribution pragma was introduced, I tried hard to keep the old behavior for opt: when opt is invoked with -loop-distribute, it should distribute the loop (it's off by default when run via the optimization pipeline). As MichaelZ has discovered, this has the unintended consequence of breaking a very common developer workflow to reproduce compilations using opt: first you print the pass pipeline of clang with -debug-pass=Arguments and then invoke opt with the returned arguments. clang -debug-pass will include -loop-distribute but the pass is invoked with default=off, so nothing happens unless the loop carries the pragma, while through opt (default=on) we will try to distribute all loops. This changes opt's default to off as well to match clang. The tests are modified to explicitly enable the transformation. llvm-svn: 290235
* remove pretty-print test that requires debug | Sebastian Pop | 2016-12-21 | 1 | -5/+0
  There is no need to test the pretty printer. Remove the bogus test to make the build bots happy. llvm-svn: 290234
* machine combiner: fix pretty printer | Sebastian Pop | 2016-12-21 | 1 | -0/+5
  We used to print UNKNOWN instructions when the instruction to be printed was not yet inserted in any BB: in that case the pretty printer would not be able to compute a TII as the instruction does not belong to any BB or function yet. This patch explicitly passes the TII to the pretty-printer. Differential Revision: https://reviews.llvm.org/D27645 llvm-svn: 290228
* [Analysis] Centralize objectsize lowering logic. | George Burgess IV | 2016-12-20 | 1 | -1/+34
  We're currently doing nearly the same thing for @llvm.objectsize in three different places: two of them are missing checks for overflow, and one of them could subtly break if InstCombine gets much smarter about removing alloc sites. Seems like a good idea to not do that. llvm-svn: 290214
* Revert "[ObjectYAML] Support for DWARF debug_info section" | Chris Bieneman | 2016-12-20 | 1 | -525/+0
  This reverts commit r290204. Still breaking bots... In a meeting now, so I can't fix it immediately. Bot URL: http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/2415 llvm-svn: 290209
* [ObjectYAML] Support for DWARF debug_info section | Chris Bieneman | 2016-12-20 | 1 | -0/+525
  This patch adds support for YAML<->DWARF for debug_info sections. This re-lands r290147, after fixing the issue that caused bots to fail (thank you UBSan!). llvm-svn: 290204
* IR: Eliminate non-determinism in the module summary analysis. | Peter Collingbourne | 2016-12-20 | 1 | -2/+2
  Also make the summary ref and call graph vectors immutable. This means a smaller API surface and fewer places to audit for non-determinism. Differential Revision: https://reviews.llvm.org/D27875 llvm-svn: 290200
* [ARM] Implement isExtractSubvectorCheap. | Eli Friedman | 2016-12-20 | 4 | -51/+72
  See https://reviews.llvm.org/D6678 for the history of isExtractSubvectorCheap. Essentially the same considerations apply to ARM. This temporarily breaks the formation of vpadd/vpaddl in certain cases; AddCombineToVPADDL essentially assumes that we won't form VUZP shuffles. See https://reviews.llvm.org/D27779 for followup fix. Differential Revision: https://reviews.llvm.org/D27774 llvm-svn: 290198
* [ARM] Generate checks for shuffle tests using update_llc_test_checks.py. | Eli Friedman | 2016-12-20 | 3 | -143/+542
  llvm-svn: 290196
* AMDGPU: Allow 16-bit types in inline asm constraints | Matt Arsenault | 2016-12-20 | 1 | -0/+41
  llvm-svn: 290193
* AMDGPU: Run fp combine tests on VI | Matt Arsenault | 2016-12-20 | 3 | -135/+171
  llvm-svn: 290192
* AMDGPU: Don't add same instruction multiple times to worklist | Matt Arsenault | 2016-12-20 | 1 | -0/+14
  When the instruction is processed the first time, it may be deleted, resulting in crashes. While the new test adds the same user to the worklist twice, this particular case doesn't crash but I'm not sure why. llvm-svn: 290191
* AMDGPU/SI: Add a MachineMemOperand when lowering llvm.amdgcn.buffer.load.* | Tom Stellard | 2016-12-20 | 2 | -3/+17
  Reviewers: arsenm, nhaehnle, mareko Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D27834 llvm-svn: 290184
* AMDGPU/SI: Add a MachineMemOperand to MIMG instructions | Tom Stellard | 2016-12-20 | 1 | -1/+14
  Summary: Without a MachineMemOperand, the scheduler was assuming MIMG instructions were ordered memory references, so no loads or stores could be reordered across them. Reviewers: arsenm Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D27536 llvm-svn: 290179
* [PM] Provide an initial, minimal port of the inliner to the new pass manager. | Chandler Carruth | 2016-12-20 | 4 | -0/+416
  This doesn't implement *every* feature of the existing inliner, but tries to implement the most important ones for building a functional optimization pipeline and beginning to sort out bugs, regressions, and other problems.
  Notable, but intentional omissions:
  - No alloca merging support. Why? Because it isn't clear we want to do this at all. Active discussion and investigation is going on to remove it, so for simplicity I omitted it.
  - No support for trying to iterate on "internally" devirtualized calls. Why? Because it adds what I suspect is inappropriate coupling for little or no benefit. We will have an outer iteration system that tracks devirtualization including that from function passes and iterates already. We should improve that rather than approximate it here.
  - Optimization remarks. Why? Purely to make the patch smaller, no other reason at all.
  The last one I'll probably work on almost immediately. But I wanted to skip it in the initial patch to try to focus the change as much as possible as there is already a lot of code moving around and both of these *could* be skipped without really disrupting the core logic.
  A summary of the different things happening here:
  1) Adding the usual new PM class and rigging.
  2) Fixing minor underlying assumptions in the inline cost analysis or inline logic that don't generally hold in the new PM world.
  3) Adding the core pass logic which is in essence a loop over the calls in the nodes in the call graph. This is a bit duplicated from the old inliner, but only a handful of lines could realistically be shared. (I tried at first, and it really didn't help anything.) All told, this is only about 100 lines of code, and most of that is the mechanics of wiring up analyses from the new PM world.
  4) Updating the LazyCallGraph (in the new PM) based on the *newly inlined* calls and references. This is very minimal because we cannot form cycles.
  5) When inlining removes the last use of a function, eagerly nuking the body of the function so that any "one use remaining" inline cost heuristics are immediately refined, and queuing these functions to be completely deleted once inlining is complete and the call graph updated to reflect that they have become dead.
  6) After all the inlining for a particular function, updating the LazyCallGraph and the CGSCC pass manager to reflect the function-local simplifications that are done immediately and internally by the inline utilities. These are the exact same fundamental set of CG updates done by arbitrary function passes.
  7) Adding a bunch of test cases to specifically target CGSCC and other subtle aspects in the new PM world.
  Many thanks to the careful review from Easwaran and Sanjoy and others! Differential Revision: https://reviews.llvm.org/D24226 llvm-svn: 290161
* [IR] Remove the DIExpression field from DIGlobalVariable. | Adrian Prantl | 2016-12-20 | 166 | -418/+571
  This patch implements PR31013 by introducing a DIGlobalVariableExpression that holds a pair of DIGlobalVariable and DIExpression. Currently, DIGlobalVariable holds a DIExpression. This is not the best way to model this: (1) The DIGlobalVariable should describe the source level variable, not how to get to its location. (2) It makes it unsafe/hard to update the expressions when we call replaceExpression on the DIGlobalVariable. (3) It makes it impossible to represent a global variable that is in more than one location (e.g., a variable with multiple DW_OP_LLVM_fragment-s). We also moved away from attaching the DIExpression to DILocalVariable for the same reasons. This reapplies r289902 with additional testcase upgrades and a change to the Bitcode record for DIGlobalVariable, that makes upgrading the old format unambiguous also for variables without DIExpressions. <rdar://problem/29250149> https://llvm.org/bugs/show_bug.cgi?id=31013 Differential Revision: https://reviews.llvm.org/D26769 llvm-svn: 290153
* Revert "[ObjectYAML] Support for DWARF debug_info section" | Chris Bieneman | 2016-12-20 | 1 | -525/+0
  This reverts commit r290147. This commit is breaking a bot (http://lab.llvm.org:8011/builders/clang-atom-d525-fedora-rel/builds/621). I don't have time to investigate at the moment, so I'll revert for now. llvm-svn: 290148
* [ObjectYAML] Support for DWARF debug_info section | Chris Bieneman | 2016-12-20 | 1 | -0/+525
  This patch adds support for YAML<->DWARF for debug_info sections. llvm-svn: 290147
* Add ARM support to update_llc_test_checks.py | Eli Friedman | 2016-12-19 | 1 | -34/+64
  Just the minimal support to get it working at the moment. Includes checks for test/CodeGen/ARM/vzip.ll as an example. Differential Revision: https://reviews.llvm.org/D27829 llvm-svn: 290144
* [ObjectYAML] Support for DWARF Pub Sections | Chris Bieneman | 2016-12-19 | 1 | -0/+355
  This patch adds support for YAML<->DWARF round tripping for pub* section data. The patch supports both GNU and non-GNU style entries. llvm-svn: 290139
* [InstCombine] use commutative matcher for pattern with commutative operators | Sanjay Patel | 2016-12-19 | 1 | -1/+16
  This is a case that was missed in: https://reviews.llvm.org/rL290067 ...and it would regress if we fix operand complexity (PR28296). llvm-svn: 290127
* [InstCombine] add folds for icmp (umin|umax X, Y), X | Sanjay Patel | 2016-12-19 | 2 | -73/+25
  This is a follow-up to: https://reviews.llvm.org/rL289855 (https://reviews.llvm.org/D27531) https://reviews.llvm.org/rL290111 llvm-svn: 290118
* [LoopVersioning] Require loop-simplify form for loop versioning. | Florian Hahn | 2016-12-19 | 5 | -10/+53
  Summary: Requiring loop-simplify form for loop versioning ensures that the runtime check block always dominates the exit block. This patch closes #30958 (https://llvm.org/bugs/show_bug.cgi?id=30958). Reviewers: silviu.baranga, hfinkel, anemet, ashutosh.nema Subscribers: ashutosh.nema, mzolotukhin, efriedma, hfinkel, llvm-commits Differential Revision: https://reviews.llvm.org/D27469 llvm-svn: 290116
* [AMDGPU] When unifying metadata, add operands to named metadata individually | Konstantin Zhuravlyov | 2016-12-19 | 1 | -6/+11
  Differential Revision: https://reviews.llvm.org/D27725 llvm-svn: 290114
* [InstCombine] add folds for icmp (smax X, Y), X | Sanjay Patel | 2016-12-19 | 1 | -36/+12
  This is a follow-up to: https://reviews.llvm.org/rL289855 (D27531) llvm-svn: 290111
* [ARM] GlobalISel: Add more checks to test | Diana Picus | 2016-12-19 | 1 | -0/+4
  llvm-svn: 290108
* [ARM] GlobalISel: Minor style fixup in test | Diana Picus | 2016-12-19 | 1 | -3/+3
  llvm-svn: 290107
* [ARM] GlobalISel: Lower i8 and i16 register args | Diana Picus | 2016-12-19 | 2 | -8/+52
  This allows lowering i8 and i16 arguments if they can fit in the registers. Note that the lowering is incomplete - ABI extensions are handled in a subsequent patch. (Last part of) Differential Revision: https://reviews.llvm.org/D27704 llvm-svn: 290106
* [ARM] GlobalISel: Allow i8 and i16 adds | Diana Picus | 2016-12-19 | 3 | -5/+122
  Teach the instruction selector and legalizer that it's ok to have adds with 8 or 16-bit integers. This is the second part of https://reviews.llvm.org/D27704 llvm-svn: 290105
* [ARM] GlobalISel: Select i8 and i16 copies | Diana Picus | 2016-12-19 | 1 | -3/+60
  Teach the instruction selector that it's ok to copy small values from physical registers. First part of https://reviews.llvm.org/D27704 llvm-svn: 290104
* [ARM] GlobalISel: Lower more than 4 arguments | Diana Picus | 2016-12-19 | 2 | -0/+28
  This adds support for lowering more than 4 arguments (although still i32 only). It uses the handleAssignments / ValueHandler infrastructure extracted from the AArch64 backend in r288658. Differential Revision: https://reviews.llvm.org/D27195 llvm-svn: 290098
* AMDGPU: [AMDGPU] Assembler: add .hsa_code_object_metadata directive for functime metadata V2.0 | Sam Kolton | 2016-12-19 | 2 | -1/+57
  Summary: Added a pair of directives, .hsa_code_object_metadata/.end_hsa_code_object_metadata. Between them the user can put a YAML string that is placed directly into the generated note. E.g.: ''' .hsa_code_object_metadata { amd.MDVersion: [ 2, 0 ] } .end_hsa_code_object_metadata ''' Based on D25046 Reviewers: vpykhtin, nhaustov, yaxunl, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, mgorny, tony-tye Differential Revision: https://reviews.llvm.org/D27619 llvm-svn: 290097
* [ARM] GlobalISel: Support loading from the stack | Diana Picus | 2016-12-19 | 2 | -0/+62
  Add support for selecting simple G_LOAD and G_FRAME_INDEX instructions (32-bit scalars only). This will be useful for functions that need to pass arguments on the stack. First part of https://reviews.llvm.org/D27195. llvm-svn: 290096
* [XRay] Fix assertion failure on empty machine basic blocks (PR 31424) | Dean Michael Berris | 2016-12-19 | 2 | -0/+36
  The original version of the code in XRayInstrumentation.cpp assumed that functions may not have empty machine basic blocks (or that the first one couldn't be). This change addresses that by special-casing that specific situation. We provide two .mir test-cases to make sure we're handling this appropriately. Fixes llvm.org/PR31424. Reviewers: chandlerc Subscribers: varno, llvm-commits Differential Revision: https://reviews.llvm.org/D27913 llvm-svn: 290091
* Add files I seem to have dropped in my revert (r290086). | Daniel Jasper | 2016-12-19 | 1 | -0/+22
  Sorry! llvm-svn: 290087
* Revert @llvm.assume with operator bundles (r289755-r289757) | Daniel Jasper | 2016-12-19 | 12 | -42/+42
  This creates non-linear behavior in the inliner (see more details in r289755's commit thread). llvm-svn: 290086
* [FileCheck] Fix --strict-whitespace --match-full-lines -- add test-case | Tom de Vries | 2016-12-18 | 1 | -0/+14
  Add test-case that was missing in "[FileCheck] Fix --strict-whitespace --match-full-lines" commit. llvm-svn: 290070
* [InstCombine] use commutative matchers for patterns with commutative operators | Sanjay Patel | 2016-12-18 | 3 | -51/+13
  Background/motivation - I was circling back around to: https://llvm.org/bugs/show_bug.cgi?id=28296 I made a simple patch for that and noticed some regressions, so added test cases for those with rL281055, and this is hopefully the minimal fix for just those cases. But as you can see from the surrounding untouched folds, we are missing commuted patterns all over the place, and of course there are no regression tests to cover any of those cases. We could sprinkle "m_c_" dust all over this file and catch most of the missing folds, but then we still wouldn't have test coverage, and we'd still miss some fraction of commuted patterns because they require adjustments to the match order. I'm aware of the concern about the potential compile-time performance impact of adding matches like this (currently being discussed on llvm-dev), but I don't think there's any evidence yet to suggest that handling commutative pattern matching more thoroughly is not a worthwhile goal of InstCombine. Differential Revision: https://reviews.llvm.org/D24419 llvm-svn: 290067
* Revert r289955 and r289962. This is causing lots of ASAN failures for us. | Daniel Jasper | 2016-12-18 | 1 | -41/+0
  Not sure whether it causes an ASAN false positive or whether it actually leads to incorrect code or whether it even exposes bad code. Hans, I'll get you instructions to reproduce this. llvm-svn: 290066
* [X86] [AVX512] Minor fix in encoding of scalar EVEX instructions. NFC. | Michael Zuckerman | 2016-12-18 | 2 | -36/+36
  Commit on behalf of Gadi Haber. Removed EVEX_V512 prefix from scalar EVEX instructions since HW ignores L'L bits anyway (LIG). 4 instructions are modified. The changed encodings are validated with XED. Reviewers: delena, igorb Differential revision: https://reviews.llvm.org/D27802 llvm-svn: 290065
* [X86][SSE] Add support for combining target shuffles to SHUFPS. | Simon Pilgrim | 2016-12-18 | 11 | -113/+93
  As discussed on D27692, the next step will be to allow cross-domain shuffles once the combined shuffle depth passes a certain point. llvm-svn: 290064
* [X86][SSE][AVX-512] Convert FAND/FOR/FXOR/FANDN nodes to integer operations if they are available. | Craig Topper | 2016-12-18 | 4 | -45/+58
  This will allow a bunch of patterns to be removed. These nodes are only emitted for lowering FABS/FNEG/FNABS/FCOPYSIGN. Ideally we just wouldn't create these nodes if SSE2 or higher is available, but it was simple to just convert them in DAG combine. For SSE2, AVX, and AVX512 with DQI this is no functional change as the execution domain fixing pass ensures the right domain is selected regardless of the ISD opcode. For AVX-512 without DQI we end up using integer instructions since the floating point versions aren't available. But we were already doing that for any logical operations in code that didn't come from FABS/FNEG/FNABS/FCOPYSIGN so this seems no worse. And we get the benefit of being able to fold broadcasts now. llvm-svn: 290060
* [AVX-512] Use EVEX encoded XOR instruction for zeroing scalar registers when DQI and VLX instructions are available. | Craig Topper | 2016-12-18 | 1 | -1/+25
  This can give the register allocator more registers to use. llvm-svn: 290057
* [AVX-512] Make sure VLX is also enabled before using EVEX encoded logic ops for scalars. | Craig Topper | 2016-12-18 | 1 | -1/+1
  I missed this in r290049. llvm-svn: 290055
* AMDGPU: Fix broken check prefix in test | Matt Arsenault | 2016-12-17 | 1 | -10/+7
  llvm-svn: 290050
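
Note on the `icmp (umin|umax X, Y), X` entries above (r290118, r290111): the log does not list the individual folds, so the following is only a sketch of the kind of algebraic identity such a fold relies on, not the patch's actual rule set. It brute-forces a few unsigned identities over a small bit width in Python; the function name `check_unsigned_minmax_identities` and the choice of identities are assumptions made for illustration.

```python
from itertools import product


def check_unsigned_minmax_identities(bits: int = 4) -> int:
    """Exhaustively check comparisons of umax/umin(X, Y) against X.

    Hypothetical helper for illustration only; returns the number of
    (x, y) pairs that were checked.
    """
    checked = 0
    for x, y in product(range(1 << bits), repeat=2):
        umax = max(x, y)  # unsigned max, since all values are non-negative
        umin = min(x, y)  # unsigned min

        # Comparisons against X reduce to comparisons between X and Y.
        assert (umax == x) == (x >= y)
        assert (umax != x) == (x < y)
        assert (umax > x) == (y > x)
        assert (umin == x) == (x <= y)
        assert (umin < x) == (y < x)

        # Some comparisons are constant regardless of Y.
        assert umax >= x
        assert not (umax < x)
        assert umin <= x
        assert not (umin > x)

        checked += 1
    return checked


if __name__ == "__main__":
    print(check_unsigned_minmax_identities(4))
```

A small width keeps the exhaustive loop instant; the identities do not depend on the width, so the same check passes for any `bits` value.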