summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [X86] Vectorcall Calling Convention - Adding CodeGen Complete SupportOren Ben Simhon2016-12-216-66/+269
| | | | | | | | | | | | | The vectorcall calling convention specifies that arguments to functions are to be passed in registers, when possible. vectorcall uses more registers for arguments than fastcall or the default x64 calling convention use. The vectorcall calling convention is only supported in native code on x86 and x64 processors that include Streaming SIMD Extensions 2 (SSE2) and above. The current implementation does not handle Homogeneous Vector Aggregates (HVAs) correctly and this review attempts to fix it. This aubmit also includes additional lit tests to cover better HVAs corner cases. Differential Revision: https://reviews.llvm.org/D27392 llvm-svn: 290240
* [LDist] Match behavior between invoking via optimization pipeline or opt ↵Adam Nemet2016-12-212-32/+9
| | | | | | | | | | | | | | | | | | | | | | | | -loop-distribute In r267672, where the loop distribution pragma was introduced, I tried it hard to keep the old behavior for opt: when opt is invoked with -loop-distribute, it should distribute the loop (it's off by default when ran via the optimization pipeline). As MichaelZ has discovered this has the unintended consequence of breaking a very common developer work-flow to reproduce compilations using opt: First you print the pass pipeline of clang with -debug-pass=Arguments and then invoking opt with the returned arguments. clang -debug-pass will include -loop-distribute but the pass is invoked with default=off so nothing happens unless the loop carries the pragma. While through opt (default=on) we will try to distribute all loops. This changes opt's default to off as well to match clang. The tests are modified to explicitly enable the transformation. llvm-svn: 290235
* [APFloat] Remove 'else' after return. NFCTim Shen2016-12-211-13/+15
| | | | | | | | | | Reviewers: kbarton, iteratee, hfinkel, echristo Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D27934 llvm-svn: 290232
* machine combiner: fix pretty printerSebastian Pop2016-12-212-8/+10
| | | | | | | | | | | we used to print UNKNOWN instructions when the instruction to be printer was not yet inserted in any BB: in that case the pretty printer would not be able to compute a TII as the instruction does not belong to any BB or function yet. This patch explicitly passes the TII to the pretty-printer. Differential Revision: https://reviews.llvm.org/D27645 llvm-svn: 290228
* IPO: Remove the ModuleSummary argument to the FunctionImport pass. NFCI.Peter Collingbourne2016-12-212-34/+15
| | | | | | | | | No existing client is passing a non-null value here. This will come back in a slightly different form as part of the type identifier summary work. Differential Revision: https://reviews.llvm.org/D28006 llvm-svn: 290222
* [Analysis] Centralize objectsize lowering logic.George Burgess IV2016-12-204-28/+40
| | | | | | | | | We're currently doing nearly the same thing for @llvm.objectsize in three different places: two of them are missing checks for overflow, and one of them could subtly break if InstCombine gets much smarter about removing alloc sites. Seems like a good idea to not do that. llvm-svn: 290214
* Move GlobPattern class from LLD to llvm/Support.Rui Ueyama2016-12-202-0/+168
| | | | | | | | | | GlobPattern is a class to handle glob pattern matching. Currently only LLD is using that, but technically that feature is not specific to linkers, so in this patch I move that file to LLVM. Differential Revision: https://reviews.llvm.org/D27969 llvm-svn: 290212
* [SCEV] Be less conservative when extending bitwidths for computing ranges.Michael Zolotukhin2016-12-201-7/+6
| | | | | | | | | | | | | | | | | | | | | | | | Summary: In getRangeForAffineAR we compute ranges for affine exprs E = A + B*C, where ranges for A, B, and C are known. To avoid overflow, we need to operate on a bigger bitwidth, and originally we chose 2*x+1 for this (x being the original bitwidth). However, it is safe to use just 2*x: A+B*C <= (2^x - 1) + (2^x - 1)*(2^x - 1) = = 2^x - 1 + 2^2x - 2^x - 2^x + 1 = = 2^2x - 2^x <= 2^2x - 1 Unnecessary extending of bitwidths results in noticeable slowdowns: ranges perform arithmetic operations using APInt, which are much slower when bitwidths are bigger than 64. Reviewers: sanjoy, majnemer, chandlerc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27795 llvm-svn: 290211
* Revert "[ObjectYAML] Support for DWARF debug_info section"Chris Bieneman2016-12-202-41/+4
| | | | | | | | | | | This reverts commit r290204. Still breaking bots... In a meeting now, so I can't fix it immediately. Bot URL: http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/2415 llvm-svn: 290209
* [ObjectYAML] Support for DWARF debug_info sectionChris Bieneman2016-12-202-4/+41
| | | | | | | | This patch adds support for YAML<->DWARF for debug_info sections. This re-lands r290147, after fixing the issue that caused bots to fail (thank you UBSan!). llvm-svn: 290204
* IR: Eliminate non-determinism in the module summary analysis.Peter Collingbourne2016-12-203-107/+82
| | | | | | | | | Also make the summary ref and call graph vectors immutable. This means a smaller API surface and fewer places to audit for non-determinism. Differential Revision: https://reviews.llvm.org/D27875 llvm-svn: 290200
* [LoopUnroll] Modify a comment to clarify the usage of TripCount. NFC.Haicheng Wu2016-12-201-8/+8
| | | | | | | | | Make it clear that TripCount is the upper bound of the iteration on which control exits LatchBlock. Differential Revision: https://reviews.llvm.org/D26675 llvm-svn: 290199
* [ARM] Implement isExtractSubvectorCheap.Eli Friedman2016-12-202-0/+12
| | | | | | | | | | | | | | See https://reviews.llvm.org/D6678 for the history of isExtractSubvectorCheap. Essentially the same considerations apply to ARM. This temporarily breaks the formation of vpadd/vpaddl in certain cases; AddCombineToVPADDL essentially assumes that we won't form VUZP shuffles. See https://reviews.llvm.org/D27779 for followup fix. Differential Revision: https://reviews.llvm.org/D27774 llvm-svn: 290198
* Use MaxDepth instead of repeating its valueMatt Arsenault2016-12-201-3/+3
| | | | llvm-svn: 290194
* AMDGPU: Allow 16-bit types in inline asm constraintsMatt Arsenault2016-12-201-0/+2
| | | | llvm-svn: 290193
* AMDGPU: Don't add same instruction multiple times to worklistMatt Arsenault2016-12-201-1/+7
| | | | | | | | | When the instruction is processed the first time, it may be deleted resulting in crashes. While the new test adds the same user to the worklist twice, this particular case doesn't crash but I'm not sure why. llvm-svn: 290191
* Replace std::find_if with llvm::find_if. NFC.George Burgess IV2016-12-201-5/+4
| | | | llvm-svn: 290190
* AMDGPU/SI: Make a function constTom Stellard2016-12-202-4/+3
| | | | llvm-svn: 290185
* AMDGPU/SI: Add a MachineMemOperand when lowering llvm.amdgcn.buffer.load.*Tom Stellard2016-12-206-6/+77
| | | | | | | | | | Reviewers: arsenm, nhaehnle, mareko Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye Differential Revision: https://reviews.llvm.org/D27834 llvm-svn: 290184
* [X86][SSE] Ensure we're only combining shuffles with legal mask types.Simon Pilgrim2016-12-201-0/+4
| | | | | | I haven't managed to get this to fail yet but its technically possible for the AND -> shuffle decomposition to result in illegal types. llvm-svn: 290183
* AMDGPU/SI: Add a MachineMemOperand to MIMG instructionsTom Stellard2016-12-204-6/+57
| | | | | | | | | | | | | | | Summary: Without a MachineMemOperand, the scheduler was assuming MIMG instructions were ordered memory references, so no loads or stores could be reordered across them. Reviewers: arsenm Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D27536 llvm-svn: 290179
* Fix build with expensive checks enabledSerge Pavlov2016-12-201-0/+1
| | | | | | | | | Include of llvm/IR/Verifier.h was removed from HexagonCommonGEP.cpp in r289604 as unused. In fact it is required when expensive checks are enabled, because it declared function `verifyFunction`, which is called in conditionally compiled part of the file. llvm-svn: 290170
* [PM] Rework a loop in the CGSCC update logic to be more conservative andChandler Carruth2016-12-201-7/+11
| | | | | | | | | | clear. The current RefSCC can occur in exactly one position so we should just enforce that and leverage the property rather than checking for it anywhere. This addresses review comments made on another patch. llvm-svn: 290162
* [PM] Provide an initial, minimal port of the inliner to the new pass manager.Chandler Carruth2016-12-207-23/+207
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This doesn't implement *every* feature of the existing inliner, but tries to implement the most important ones for building a functional optimization pipeline and beginning to sort out bugs, regressions, and other problems. Notable, but intentional omissions: - No alloca merging support. Why? Because it isn't clear we want to do this at all. Active discussion and investigation is going on to remove it, so for simplicity I omitted it. - No support for trying to iterate on "internally" devirtualized calls. Why? Because it adds what I suspect is inappropriate coupling for little or no benefit. We will have an outer iteration system that tracks devirtualization including that from function passes and iterates already. We should improve that rather than approximate it here. - Optimization remarks. Why? Purely to make the patch smaller, no other reason at all. The last one I'll probably work on almost immediately. But I wanted to skip it in the initial patch to try to focus the change as much as possible as there is already a lot of code moving around and both of these *could* be skipped without really disrupting the core logic. A summary of the different things happening here: 1) Adding the usual new PM class and rigging. 2) Fixing minor underlying assumptions in the inline cost analysis or inline logic that don't generally hold in the new PM world. 3) Adding the core pass logic which is in essence a loop over the calls in the nodes in the call graph. This is a bit duplicated from the old inliner, but only a handful of lines could realistically be shared. (I tried at first, and it really didn't help anything.) All told, this is only about 100 lines of code, and most of that is the mechanics of wiring up analyses from the new PM world. 4) Updating the LazyCallGraph (in the new PM) based on the *newly inlined* calls and references. This is very minimal because we cannot form cycles. 5) When inlining removes the last use of a function, eagerly nuking the body of the function so that any "one use remaining" inline cost heuristics are immediately refined, and queuing these functions to be completely deleted once inlining is complete and the call graph updated to reflect that they have become dead. 6) After all the inlining for a particular function, updating the LazyCallGraph and the CGSCC pass manager to reflect the function-local simplifications that are done immediately and internally by the inline utilties. These are the exact same fundamental set of CG updates done by arbitrary function passes. 7) Adding a bunch of test cases to specifically target CGSCC and other subtle aspects in the new PM world. Many thanks to the careful review from Easwaran and Sanjoy and others! Differential Revision: https://reviews.llvm.org/D24226 llvm-svn: 290161
* Reapply r289926: attempt to fix windows buildAdrian Prantl2016-12-201-1/+2
| | | | llvm-svn: 290158
* [IR] Remove the DIExpression field from DIGlobalVariable.Adrian Prantl2016-12-2018-182/+354
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements PR31013 by introducing a DIGlobalVariableExpression that holds a pair of DIGlobalVariable and DIExpression. Currently, DIGlobalVariables holds a DIExpression. This is not the best way to model this: (1) The DIGlobalVariable should describe the source level variable, not how to get to its location. (2) It makes it unsafe/hard to update the expressions when we call replaceExpression on the DIGLobalVariable. (3) It makes it impossible to represent a global variable that is in more than one location (e.g., a variable with multiple DW_OP_LLVM_fragment-s). We also moved away from attaching the DIExpression to DILocalVariable for the same reasons. This reapplies r289902 with additional testcase upgrades and a change to the Bitcode record for DIGlobalVariable, that makes upgrading the old format unambiguous also for variables without DIExpressions. <rdar://problem/29250149> https://llvm.org/bugs/show_bug.cgi?id=31013 Differential Revision: https://reviews.llvm.org/D26769 llvm-svn: 290153
* Revert "[ObjectYAML] Support for DWARF debug_info section"Chris Bieneman2016-12-202-41/+4
| | | | | | | | This reverts commit r290147. This commit is breaking a bot (http://lab.llvm.org:8011/builders/clang-atom-d525-fedora-rel/builds/621). I don't have time to investigate at the moment, so I'll revert for now. llvm-svn: 290148
* [ObjectYAML] Support for DWARF debug_info sectionChris Bieneman2016-12-202-4/+41
| | | | | | This patch adds support for YAML<->DWARF for debug_info sections. llvm-svn: 290147
* [LV] Sink tripcount query to where it's actually used. NFC.Michael Kuperstein2016-12-191-5/+4
| | | | llvm-svn: 290142
* [ObjectYAML] Support for DWARF Pub SectionsChris Bieneman2016-12-191-0/+30
| | | | | | This patch adds support for YAML<->DWARF round tripping for pub* section data. The patch supports both GNU and non-GNU style entries. llvm-svn: 290139
* [libfuzzer] dump_coverage command line flagMike Aizatsky2016-12-197-0/+28
| | | | | | | | Reviewers: kcc, vitalybuka Differential Revision: https://reviews.llvm.org/D27942 llvm-svn: 290138
* [TargetInstrInfo] replace redundant expression in getMemOpBaseRegImmOfsMichael LeMay2016-12-191-2/+1
| | | | | | | | | | | | | | | Summary: The expression for computing the return value of getMemOpBaseRegImmOfs has only one possible value. The other value would result in a return earlier in the function. This patch replaces the expression with its only possible value. Reviewers: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27437 llvm-svn: 290133
* Make a function to correctly extract the DW_AT_high_pc given the low pc value.Greg Clayton2016-12-191-13/+22
| | | | | | | | DWARF 4 and later supports encoding the PC as an address or as as offset from the low PC. Clients using DWARFDie should be insulated from how to extract the high PC value. This function takes care of extracting the form value and looking for the correct form. Differential Revision: https://reviews.llvm.org/D27885 llvm-svn: 290131
* [InstCombine] use commutative matcher for pattern with commutative operatorsSanjay Patel2016-12-191-3/+5
| | | | | | | | This is a case that was missed in: https://reviews.llvm.org/rL290067 ...and it would regress if we fix operand complexity (PR28296). llvm-svn: 290127
* [InstCombine] add folds for icmp (umin|umax X, Y), XSanjay Patel2016-12-191-19/+54
| | | | | | | | This is a follow-up to: https://reviews.llvm.org/rL289855 (https://reviews.llvm.org/D27531) https://reviews.llvm.org/rL290111 llvm-svn: 290118
* [LoopVersioning] Require loop-simplify form for loop versioning.Florian Hahn2016-12-194-14/+17
| | | | | | | | | | | | | | | | Summary: Requiring loop-simplify form for loop versioning ensures that the runtime check block always dominates the exit block. This patch closes #30958 (https://llvm.org/bugs/show_bug.cgi?id=30958). Reviewers: silviu.baranga, hfinkel, anemet, ashutosh.nema Subscribers: ashutosh.nema, mzolotukhin, efriedma, hfinkel, llvm-commits Differential Revision: https://reviews.llvm.org/D27469 llvm-svn: 290116
* [AMDGPU] When unifying metadata, add operands to named metadata individuallyKonstantin Zhuravlyov2016-12-191-3/+5
| | | | | | Differential Revision: https://reviews.llvm.org/D27725 llvm-svn: 290114
* [InstCombine] add folds for icmp (smax X, Y), XSanjay Patel2016-12-191-17/+35
| | | | | | | This is a follow-up to: https://reviews.llvm.org/rL289855 (D27531) llvm-svn: 290111
* Silence unused warning.Daniel Jasper2016-12-191-0/+1
| | | | llvm-svn: 290109
* [ARM] GlobalISel: Lower i8 and i16 register argsDiana Picus2016-12-191-5/+20
| | | | | | | | | | | This allows lowering i8 and i16 arguments if they can fit in the registers. Note that the lowering is incomplete - ABI extensions are handled in a subsequent patch. (Last part of) Differential Revision: https://reviews.llvm.org/D27704 llvm-svn: 290106
* [ARM] GlobalISel: Allow i8 and i16 addsDiana Picus2016-12-191-1/+6
| | | | | | | | | Teach the instruction selector and legalizer that it's ok to have adds with 8 or 16-bit integers. This is the second part of https://reviews.llvm.org/D27704 llvm-svn: 290105
* [ARM] GlobalISel: Select i8 and i16 copiesDiana Picus2016-12-191-2/+9
| | | | | | | | | Teach the instruction selector that it's ok to copy small values from physical registers. First part of https://reviews.llvm.org/D27704 llvm-svn: 290104
* [Power9] Processor Model for SchedulingEhsan Amiri2016-12-194-3/+1145
| | | | | | | | PWR9 processor model for instruction scheduling. A subsequent patch will migrate PWR9 to Post RA MIScheduler. https://reviews.llvm.org/D24525 llvm-svn: 290102
* [Hexagon] Restore minimum profit check accidentally changed in r290024Malcolm Parsons2016-12-191-2/+2
| | | | llvm-svn: 290100
* [ARM] GlobalISel: Lower more than 4 argumentsDiana Picus2016-12-191-10/+22
| | | | | | | | | | This adds support for lowering more than 4 arguments (although still i32 only). It uses the handleAssignments / ValueHandler infrastructure extracted from the AArch64 backend in r288658. Differential Revision: https://reviews.llvm.org/D27195 llvm-svn: 290098
* AMDGPU: [AMDGPU] Assembler: add .hsa_code_object_metadata directive for ↵Sam Kolton2016-12-194-72/+143
| | | | | | | | | | | | | | | | | | | | | | | | functime metadata V2.0 Summary: Added pair of directives .hsa_code_object_metadata/.end_hsa_code_object_metadata. Between them user can put YAML string that would be directly put to the generated note. E.g.: ''' .hsa_code_object_metadata { amd.MDVersion: [ 2, 0 ] } .end_hsa_code_object_metadata ''' Based on D25046 Reviewers: vpykhtin, nhaustov, yaxunl, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, mgorny, tony-tye Differential Revision: https://reviews.llvm.org/D27619 llvm-svn: 290097
* [ARM] GlobalISel: Support loading from the stackDiana Picus2016-12-193-10/+45
| | | | | | | | | | Add support for selecting simple G_LOAD and G_FRAME_INDEX instructions (32-bit scalars only). This will be useful for functions that need to pass arguments on the stack. First part of https://reviews.llvm.org/D27195. llvm-svn: 290096
* [CodeGen] Make MachineInstr::isIdenticalTo() symmetric.Bjorn Pettersson2016-12-191-5/+13
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: MachineInstr::isIdenticalTo() is for some reason not symmetric when comparing bundles, which gives us the property: I1->isIdenticalTo(*I2) != I2->isIdenticalTo(*I1) when comparing bundles where one bundle is longer than the other. This patch makes sure that bundles of different length always are considered as not being identical. Thus, the result of the comparison will be the same regardless of which side that happens to be to the left. Reviewers: dexonsmith, jonpa, andrew.w.kaylor Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D27508 llvm-svn: 290095
* [XRay] Fix assertion failure on empty machine basic blocks (PR 31424)Dean Michael Berris2016-12-191-2/+9
| | | | | | | | | | | | | | | | | | | | The original version of the code in XRayInstrumentation.cpp assumed that functions may not have empty machine basic blocks (or that the first one couldn't be). This change addresses that by special-casing that specific situation. We provide two .mir test-cases to make sure we're handling this appropriately. Fixes llvm.org/PR31424. Reviewers: chandlerc Subscribers: varno, llvm-commits Differential Revision: https://reviews.llvm.org/D27913 llvm-svn: 290091
* [X86] When recognizing vector loads or VZEXT_LOAD in selectScalarSSELoad ↵Craig Topper2016-12-191-2/+2
| | | | | | make sure we pass the load's user rather than load itself to the second operand of IsLegalToFold. llvm-svn: 290089
OpenPOWER on IntegriCloud