path: root/llvm
Commit message, Author, Age, Files, Lines
...
* [X86][MS-compatibility][llvm] allow MS TYPE/SIZE/LENGTH operators as a part ↵Coby Tayree2017-03-211-45/+60
| of a compound expression
|
| This patch gives X86AsmParser the ability to handle the aforementioned operators within compound "MS" arithmetic expressions.
| Previously they were supported only as a stand-alone operand, e.g.: "TYPE X"
| Now also allowed: "4 + TYPE X * 128"
|
| Clang side: https://reviews.llvm.org/D31174
| Differential Revision: https://reviews.llvm.org/D31173
| llvm-svn: 298425
* [X86] Remove extra semicolon to placate GCC. NFCI.Davide Italiano2017-03-211-1/+1
| | | | llvm-svn: 298423
* [ARM] Recommit the glueless lowering of addc/adde in Thumb1,Artyom Skrobov2017-03-214-39/+299
| including the amended (no UB anymore) fix for adding/subtracting -2147483648.
|
| This reverts r298328 "[ARM] Revert r297443 and r297820." and partially reverts r297842 "Revert "[Thumb1] Fix the bug when adding/subtracting -2147483648"".
|
| llvm-svn: 298417
* Use ProfileSummary::getProfileCount to get ScaledCount for ModuleSummaryDehao Chen2017-03-212-2/+6
| | | | | | | | | | | | | | Summary: ModuleSummary should use the standard interface of ProfileSummary::getProfileCount. Reviewers: eraman, tejohnson Reviewed By: tejohnson Subscribers: tejohnson, mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D31154 llvm-svn: 298404
* Revert 298388 and 298389 because they broke some AMDGPU tests.Adrian Prantl2017-03-213-186/+23
| | | | llvm-svn: 298401
* Recommit r298282 with fixes for memory allocation/deallocationKrzysztof Parzyszek2017-03-212-17/+808
| | | | | | | | | | | [Hexagon] Recognize polynomial-modulo loop idiom again Regain the ability to recognize loops calculating polynomial modulo operation. This ability has been lost due to some changes in the preceding optimizations. Add code to preprocess the IR to a form that the pattern matching code can recognize. llvm-svn: 298400
* Fix RST docs AttributeList heading underlineReid Kleckner2017-03-211-1/+1
| | | | llvm-svn: 298398
* AMDGPU: Buffer descriptor changes for GFX9Marek Olsak2017-03-215-8/+23
| | | | | | | | | | Reviewers: arsenm Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, dstuttard, tpr Differential Revision: https://reviews.llvm.org/D31158 llvm-svn: 298397
* AMDGPU: Always use VGPR indexing on GFX9Marek Olsak2017-03-214-3/+8
| | | | | | | | | | Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, dstuttard, tpr Differential Revision: https://reviews.llvm.org/D31157 llvm-svn: 298396
* [Hexagon] Add -march=hexagon to a testcaseKrzysztof Parzyszek2017-03-211-1/+1
| | | | llvm-svn: 298395
* Rename AttributeSet to AttributeListReid Kleckner2017-03-2194-920/+921
| Summary:
| This class is a list of AttributeSetNodes corresponding to the function prototype of a call or function declaration. This class used to be called ParamAttrListPtr, then AttrListPtr, then AttributeSet. It is typically accessed by parameter and return value index, so "AttributeList" seems like a more intuitive name.
|
| Rename AttributeSetImpl to AttributeListImpl to follow suit.
|
| It's useful to rename this class so that we can rename AttributeSetNode to AttributeSet later. AttributeSet is the set of attributes that apply to a single function, argument, or return value.
|
| Reviewers: sanjoy, javed.absar, chandlerc, pete
| Reviewed By: pete
| Subscribers: pete, jholewinski, arsenm, dschuff, mehdi_amini, jfb, nhaehnle, sbc100, void, llvm-commits
| Differential Revision: https://reviews.llvm.org/D31102
| llvm-svn: 298393
* AMDGPU: Fix not including v2i16/v2f16 in register classMatt Arsenault2017-03-211-1/+1
| | | | llvm-svn: 298390
* Don't compose DWARF expressions with multiple subregisters.Adrian Prantl2017-03-212-0/+131
| | | | | | | | | If a register location can only be described by a complex expression (i.e., multiple subregisters) it doesn't safely compose with another complex expression. For example, it is not possible to apply a DW_OP_deref operation to multiple DW_OP_pieces. llvm-svn: 298389
* DwarfExpression: Defer emitting DWARF register operationsAdrian Prantl2017-03-212-23/+55
| | | | | | | | until the rest of the expression is known. This is still an NFC refactoring in preparation of a subsequent bugfix. llvm-svn: 298388
* AMDGPU: Fix asserting on 0 dmask for image intrinsicsMatt Arsenault2017-03-214-0/+409
| | | | | | Fold these to undef during lowering so users get eliminated. llvm-svn: 298387
* AMDGPU: Convert image intrinsic uses in testsMatt Arsenault2017-03-2114-208/+205
| | | | llvm-svn: 298386
* DAG: Fold bitcast/extract_vector_elt of undef to undefMatt Arsenault2017-03-214-8/+40
| | | | | | Fixes not eliminating store when intrinsic is lowered to undef. llvm-svn: 298385
* Fix shufpd test name.Simon Pilgrim2017-03-211-4/+4
| | | | llvm-svn: 298381
* [ARM] [Assembler] Support negative immediates for A32, T32 and T16Sanne Wouda2017-03-2111-33/+320
| Summary:
| To support negative immediates for certain arithmetic instructions, the instruction is converted to the inverse instruction with a negated (or inverted) immediate. For example, "ADD r0, r1, #0xFFFFFFFF" cannot be encoded as an ADD instruction. However, "SUB r0, r1, #1" is equivalent.
|
| These conversions are different from instruction aliases. An alias maps several assembler instructions onto one encoding. A conversion, however, maps an *invalid* instruction (e.g. one with an immediate that cannot be represented in the encoding) to a different (but equivalent) instruction.
|
| Several instructions with negative immediates were being converted already, but this was not systematically tested, nor did it cover all instructions.
|
| This patch implements all possible substitutions for the ARM, Thumb1 and Thumb2 assemblers and adds tests. It also adds a feature flag (-mattr=+no-neg-immediates) to turn these substitutions off. This is helpful for users who want their code to assemble to exactly what they wrote.
|
| Reviewers: t.p.northover, rovka, samparker, javed.absar, peter.smith, rengolin
| Reviewed By: javed.absar
| Subscribers: aadg, aemerson, llvm-commits
| Differential Revision: https://reviews.llvm.org/D30571
| llvm-svn: 298380
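The ADD-to-SUB conversion described in r298380 can be sketched outside of LLVM. The following is a minimal Python sketch, not the assembler's actual code. It assumes the A32 "modified immediate" rule (an 8-bit value rotated right by an even amount); the function names and the `allow_neg_immediates` parameter are hypothetical stand-ins, the latter modelling the effect of -mattr=+no-neg-immediates.

```python
MASK32 = 0xFFFFFFFF


def is_a32_modified_imm(value: int) -> bool:
    """True if value is an 8-bit constant rotated right by an even amount."""
    value &= MASK32
    for rot in range(0, 32, 2):
        # Rotating left by `rot` undoes a rotate-right by `rot`.
        rotated = ((value << rot) | (value >> (32 - rot))) & MASK32
        if rotated < 0x100:
            return True
    return False


def convert_add(imm: int, allow_neg_immediates: bool = True):
    """Pick an encodable form of 'ADD rd, rn, #imm' (hypothetical helper).

    Returns (mnemonic, immediate), or None if no form is encodable.
    """
    imm &= MASK32
    if is_a32_modified_imm(imm):
        return ("ADD", imm)
    negated = (-imm) & MASK32
    if allow_neg_immediates and is_a32_modified_imm(negated):
        # Adding imm is the same as subtracting its two's-complement negation.
        return ("SUB", negated)
    return None
```

With these definitions, `convert_add(0xFFFFFFFF)` yields `("SUB", 1)`, matching the example in the commit message, while passing `allow_neg_immediates=False` leaves the instruction unencodable.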
* Test commit accessYi Kong2017-03-211-10/+10
| | | | | | Remove some trailing whitespaces. llvm-svn: 298379
* [InstCombine] auto-generate better checks; NFCSanjay Patel2017-03-212-80/+122
| | | | llvm-svn: 298377
* [x86] use PMOVMSK for vector-sized equality comparisonsSanjay Patel2017-03-212-67/+63
| | | | | | | | | | We could do better by splitting any oversized type into whatever vector size the target supports, but I left that for future work if it ever comes up. The motivating case is memcmp() calls on 16-byte structs, so I think we can wire that up with a TLI hook that feeds into this. Differential Revision: https://reviews.llvm.org/D31156 llvm-svn: 298376
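The idea behind r298376 (compare all bytes at once, then collapse the per-byte results into a bitmask) can be emulated in plain Python. This is only a sketch of the semantics, assuming a 16-byte vector and the usual PCMPEQB/PMOVMSKB behaviour; the helper names are hypothetical and this is not the code LLVM generates.

```python
def pcmpeqb(a: bytes, b: bytes) -> bytes:
    """Byte-wise equality: 0xFF where the bytes match, 0x00 elsewhere."""
    assert len(a) == len(b) == 16
    return bytes(0xFF if x == y else 0x00 for x, y in zip(a, b))


def pmovmskb(v: bytes) -> int:
    """Collect the top bit of each of the 16 bytes into a 16-bit mask."""
    mask = 0
    for i, byte in enumerate(v):
        mask |= (byte >> 7) << i
    return mask


def equal_16_bytes(a: bytes, b: bytes) -> bool:
    """Two 16-byte values are equal iff every lane compared equal."""
    return pmovmskb(pcmpeqb(a, b)) == 0xFFFF
```

A single differing byte clears exactly one bit of the mask, so the final comparison against 0xFFFF answers the whole 16-byte equality question with one scalar test.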
* [X86][AVX] Tests showing missing SHUFPD + ZERO loweringSimon Pilgrim2017-03-212-0/+99
| | | | | | This lowers to SHUFPD if the input is zeroinitializer but not with a demanded elts optimized build vector. llvm-svn: 298370
* [AMDGPU] Iterative scheduling infrastructure + minimal registry schedulerValery Pykhtin2017-03-2112-3/+1764
| | | | | | Differential revision: https://reviews.llvm.org/D31046 llvm-svn: 298368
* [GlobalISel] Fix shufflevector testsVolkan Keles2017-03-212-42/+42
| | | | | | | | clang-lld-x86_64-2stage fails because of the order of the instructions. `CHECK-DAG` directives should fix the problem. llvm-svn: 298367
* [AMDGPU] SDWA peephole optimization pass.Sam Kolton2017-03-218-1/+1092
| Summary:
| First iteration of the SDWA peephole. This pass tries to combine several instructions into one SDWA instruction. E.g. it converts:
| '''
| V_LSHRREV_B32_e32 %vreg0, 16, %vreg1
| V_ADD_I32_e32 %vreg2, %vreg0, %vreg3
| V_LSHLREV_B32_e32 %vreg4, 16, %vreg2
| '''
| Into:
| '''
| V_ADD_I32_sdwa %vreg4, %vreg1, %vreg3 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:WORD_1 src1_sel:DWORD
| '''
|
| Pass structure:
| 1. Iterate over the machine instructions in a basic block and try to apply "SDWA patterns" to each of them. An SDWA pattern matches a machine instruction to either a source or a destination SDWA operand. E.g. '''V_LSHRREV_B32_e32 %vreg0, 16, %vreg1''' is matched to the source SDWA operand '''%vreg1 src_sel:WORD_1'''.
| 2. Iterate over the found SDWA operands and find instructions that could potentially be converted into SDWA. E.g. for a source SDWA operand, the potential instructions are all instructions in this basic block that use '''%vreg0'''.
| 3. Iterate over all potential instructions and check if they can be converted into SDWA.
| 4. Convert instructions to SDWA.
|
| This review contains a basic implementation of the SDWA peephole pass. The pass requires additional testing for both correctness and performance (no performance testing has been done).
| There are several ways this pass can be improved:
| 1. Make the pass work on a whole function, not only a basic block. As far as I can see, this can be done right now without changes to the pass.
| 2. Introduce more SDWA patterns.
| 3. Introduce mnemonics to limit when SDWA patterns should apply.
|
| Reviewers: vpykhtin, alex-t, arsenm, rampitec
| Subscribers: wdng, nhaehnle, mgorny
| Differential Revision: https://reviews.llvm.org/D30038
| llvm-svn: 298365
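The three-instruction sequence and its SDWA replacement from r298365 can be checked for equivalence with a small model. This is a sketch under stated assumptions: registers are 32 bits wide, WORD_1 selects bits 31:16, and dst_unused:UNUSED_PAD zero-fills the bits of the destination that are not written. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def shift_add_shift(vreg1: int, vreg3: int) -> int:
    """Model of the original three-instruction sequence."""
    vreg0 = (vreg1 & MASK32) >> 16      # V_LSHRREV_B32_e32 %vreg0, 16, %vreg1
    vreg2 = (vreg0 + vreg3) & MASK32    # V_ADD_I32_e32 %vreg2, %vreg0, %vreg3
    return (vreg2 << 16) & MASK32       # V_LSHLREV_B32_e32 %vreg4, 16, %vreg2


def add_sdwa(vreg1: int, vreg3: int) -> int:
    """Model of V_ADD_I32_sdwa with the operand selects from the log."""
    src0 = (vreg1 >> 16) & 0xFFFF       # src0_sel:WORD_1
    src1 = vreg3 & MASK32               # src1_sel:DWORD
    result = (src0 + src1) & 0xFFFF     # only 16 bits fit into the destination word
    return result << 16                 # dst_sel:WORD_1, low half zero (assumed UNUSED_PAD)
```

Both functions keep only the low 16 bits of the sum and place them in the upper half of the result, which is why the peephole can replace the shifts with operand selects.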
* [DebugInfo][X86] Teach Optimize LEAs pass to handle debug valuesAndrea Di Biagio2017-03-212-7/+133
| | | | | | | | | | | | | | | | This patch fixes an issue in the Optimize LEAs pass where redundant LEAs were not removed because they were being used by debug values. The debug values are now ignored when determining whether LEAs are redundant. For now the debug values for the redundant LEAs are marked as undefined, effectively lost. The intention is for a follow up patch which will attempt to preserve the debug values where possible. Patch by Andrew Ng. Differential Revision: https://reviews.llvm.org/D30835 llvm-svn: 298360
* NFC. InstCombiner::visitFAdd extract LHSIntVal/RHSIntVal local variablesArtur Pilipenko2017-03-211-9/+11
| | | | llvm-svn: 298359
* [GlobalISel] Move isTriviallyDead to Utils. NFC.Volkan Keles2017-03-213-23/+25
| | | | | | Make it accessible by the targets to avoid code duplication. llvm-svn: 298358
* [DAGTypeLegalizer] Handle widening truncate to vector of i1.Jonas Paulsson2017-03-212-1/+58
| | | | | | | | | | | | Previously, PromoteIntRes_TRUNCATE() did not handle the case where the operand needs widening, which resulted in llvm_unreachable(). This patch adds the needed handling, along with a test case. Review: Eli Friedman, Simon Pilgrim. https://reviews.llvm.org/D31077 llvm-svn: 298357
* [ConstantFolding] Fix to prevent constant folding having to repeatedly scan ↵David Green2017-03-212-1/+74
| | | | | | | | | | | | | operands. NFCI After the loop unroll threshold was increased in r295538, very large constant expressions can be created. This prevents them from having to be recursively scanned, leading to a compile time blow-up. Differential Revision: https://reviews.llvm.org/D30689 llvm-svn: 298356
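The compile-time blow-up mentioned in r298356 is typical of recursively walking an expression whose operands are shared. The sketch below is a generic illustration, assuming a cache of already-visited nodes; the `Expr` class and the helper names are hypothetical, and the log does not say which data structure the actual patch uses.

```python
class Expr:
    """A toy constant expression whose operands may be shared."""

    def __init__(self, *operands):
        self.operands = operands


def count_visits_naive(expr: Expr) -> int:
    """Visit every operand every time it is reached."""
    return 1 + sum(count_visits_naive(op) for op in expr.operands)


def count_visits_cached(expr: Expr, seen=None) -> int:
    """Visit each distinct node once by remembering what was already scanned."""
    if seen is None:
        seen = set()
    if id(expr) in seen:
        return 0
    seen.add(id(expr))
    return 1 + sum(count_visits_cached(op, seen) for op in expr.operands)


def build_shared_chain(depth: int) -> Expr:
    """Each level uses the previous level twice, as unrolled code might."""
    node = Expr()
    for _ in range(depth):
        node = Expr(node, node)
    return node
```

For a chain of depth 10, the naive walk makes 2047 visits while the cached walk makes 11, which shows how quickly the repeated scanning grows with expression depth.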
* [GlobalISel] Translate shufflevectorVolkan Keles2017-03-216-4/+211
| | | | | | | | | | | | Reviewers: qcolombet, aditya_nandakumar, t.p.northover, javed.absar, ab, dsanders Reviewed By: javed.absar Subscribers: dberris, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30962 llvm-svn: 298347
* [APFloat] Tag the fltSemantic getter functions with LLVM_READNONE.Craig Topper2017-03-211-7/+7
| | | | | | This gives about an 8k reduction in the size of the opt binary on my local x86-64 build. llvm-svn: 298344
* [APInt] Add LLVM_READONLY to some methods.Craig Topper2017-03-211-9/+9
| | | | llvm-svn: 298342
* [SystemZ] Don't drop MO flags in foldMemoryOperandImpl()Jonas Paulsson2017-03-212-5/+134
| | | | | | | | | | The def operand of the new LG/LD should have the old def operands flags and subreg index. New test: test/CodeGen/SystemZ/fold-memory-op-impl.ll Review: Ulrich Weigand llvm-svn: 298341
* Fix evaluation of LLVM_DEFINITIONSSerge Pavlov2017-03-214-16/+26
| The CMake variable LLVM_DEFINITIONS collects the preprocessor definitions passed to the host compiler that builds LLVM components. A function add_llvm_definitions was introduced in AddLLVMDefinitions.cmake to keep track of these definitions and was intended to be a replacement for the CMake command add_definitions. In practice, add_definitions is still used in many cases, so the content of LLVM_DEFINITIONS is no longer up to date. On the other hand, the current version of CMake allows getting the set of definitions in a more convenient way. This fix implements evaluation of the variable by reading the corresponding CMake property.
|
| Differential Revision: https://reviews.llvm.org/D31125
| llvm-svn: 298336
* Revert "[Hexagon] Recognize polynomial-modulo loop idiom again"Vitaly Buka2017-03-212-784/+17
| | | | | | | | Fix memory leaks on check-llvm tests detected by Asan. This reverts commit r298282. llvm-svn: 298329
* [ARM] Revert r297443 and r297820.Eli Friedman2017-03-214-287/+39
| | | | | | | | | | | | The glueless lowering of addc/adde in Thumb1 has known serious miscompiles (see https://reviews.llvm.org/D31081), and r297820 causes an infinite loop for certain constructs. It's not clear when they will be fixed, so let's just take them out of the tree for now. (I resolved a small conflict with r297453.) llvm-svn: 298328
* [Support] Fill the file_status struct with link count.Zachary Turner2017-03-203-16/+29
| | | | | | Differential Revision: https://reviews.llvm.org/D31110 llvm-svn: 298326
* Add a function to MD5 a file's contents.Zachary Turner2017-03-208-22/+111
| | | | | | | | | | | | | | | In doing so, clean up the MD5 interface a little. Most existing users only care about the lower 8 bytes of an MD5, but for some users that care about the upper and lower, there wasn't a good interface. Furthermore, consumers of the MD5 checksum were required to handle endianness details on their own, so it seems reasonable to abstract this into a nicer interface that just gives you the right value. Differential Revision: https://reviews.llvm.org/D31105 llvm-svn: 298322
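The "lower 8 bytes" and endianness handling mentioned in r298322 can be illustrated with Python's hashlib. This is a sketch, not the LLVM interface: it assumes that the low word is the first 8 bytes of the digest and that both words are read as little-endian integers, and the function names are hypothetical.

```python
import hashlib


def md5_file(path: str, chunk_size: int = 65536) -> bytes:
    """Return the 16-byte MD5 digest of a file, reading it in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.digest()


def md5_words(digest: bytes):
    """Split a digest into (low, high) 64-bit words.

    Assumption: the low word is the first 8 bytes, read little-endian.
    """
    assert len(digest) == 16
    low = int.from_bytes(digest[:8], "little")
    high = int.from_bytes(digest[8:], "little")
    return low, high
```

Callers that only need a 64-bit hash can take the first element of `md5_words(md5_file(path))` and never touch byte order themselves, which is the kind of convenience the commit message argues for.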
* [ARM] Fix PR32130: Handle promotion of zero sized constants.Vadzim Dambrouski2017-03-202-1/+11
| | | | | | | | | | | The special case of zero sized values was previously not handled correctly. This patch handles this by not promoting if the size is zero. Patch by Tim Neumann. Differential Revision: https://reviews.llvm.org/D31116 llvm-svn: 298320
* [x86] add tests for setcc of i128/i256; NFCSanjay Patel2017-03-201-0/+188
| | | | llvm-svn: 298317
* InstCombine: Check source value precision when reducing cast intrinsicMatt Arsenault2017-03-202-38/+419
| | | | | | Missed this check when porting from the libcall version. llvm-svn: 298312
* GlobalISel: add implicit defs & uses when mutating an instruction.Tim Northover2017-03-202-3/+20
| | | | | | Otherwise a scheduler might do bad things to the code we produce. llvm-svn: 298311
* Replace uses of DwarfExpression::addMachineReg* with addMachineRegExpressionAdrian Prantl2017-03-204-81/+96
| | | | | | | | | | | | | and mark the methods as protected. Besides reducing the surface area of DwarfExpression, this is in preparation for an upcoming bugfix in the DwarfExpression implementation, for which it will be necessary to defer emitting register operations until the rest of the expression is known. NFC llvm-svn: 298309
* Make implementation details in DwarfExpression protected. (NFC)Adrian Prantl2017-03-201-13/+12
| | | | llvm-svn: 298308
* [Fuchsia] Use %gs for ABI slots under -mcmodel=kernelEvgeniy Stepanov2017-03-201-2/+2
| Make x86_64-fuchsia targets under -mcmodel=kernel use %gs rather than %fs to access ABI slots for stack-protector and safe-stack.
|
| Patch by Roland McGrath.
|
| Differential Revision: https://reviews.llvm.org/D30870
| llvm-svn: 298302
* [SCEV] Fix trip multiple calculationEli Friedman2017-03-202-10/+141
| If a loop bound contains calculations like min(a,b), the Scalar Evolution API getSmallConstantTripMultiple returns 4294967295 ("-1") as the trip multiple. The problem is that SCEV uses -1 * umax to represent umin. The constant multiplier -1 was returned, and the logic guarding against huge trip counts was skipped because -1 has 32 active bits.
|
| The fix attempts to factor more general cases. It first tries to get the greatest power-of-two divisor of the trip count expression. In case overflow happens, the trip count expression is still divisible by the greatest power-of-two divisor returned. It returns 1 if the expression is not divisible by 2.
|
| Patch by Huihui Zhang <huihuiz@codeaurora.org>
|
| Differential Revision: https://reviews.llvm.org/D30840
| llvm-svn: 298301
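The arithmetic behind r298301 can be reproduced with ordinary integers. The sketch below assumes a 32-bit trip count and hypothetical helper names; returning 1 for a zero input is a choice made for this illustration, not something stated in the log.

```python
BITS = 32
MOD = 1 << BITS


def active_bits(value: int) -> int:
    """Number of bits needed to represent value as an unsigned 32-bit integer."""
    return (value % MOD).bit_length()


def pow2_trip_multiple(trip_count: int) -> int:
    """Greatest power-of-two divisor of a 32-bit trip count (1 if it is odd)."""
    trip_count %= MOD
    if trip_count == 0:
        return 1  # assumption for this sketch: treat zero as "nothing known"
    return trip_count & -trip_count  # isolates the lowest set bit


def wrapped_mul(a: int, b: int) -> int:
    """Multiplication that overflows the way 32-bit unsigned arithmetic does."""
    return (a * b) % MOD
```

Here `active_bits(-1)` is 32, which is why -1 looked like a huge constant, and `pow2_trip_multiple(12)` is 4. The product `wrapped_mul(12, 0x60000000)` overflows to 0x80000000 yet remains divisible by 4, illustrating why a power-of-two divisor stays valid under wraparound.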
* [X86] Clean up test/CodeGen/X86/2006-03-01-InstrSchedBug.llDavid L. Jones2017-03-201-8/+20
| Summary:
| - Migrated from grep to FileCheck.
| - Re-indented, removed boilerplate comments.
| - Added 'entry' label at beginning of basic block.
|
| Patch by Jorge Gorbe!
|
| Reviewed By: RKSimon
| Subscribers: RKSimon, jgorbe, llvm-commits
| Differential Revision: https://reviews.llvm.org/D30317
| llvm-svn: 298298
* Explicitly add move constructor/assignment operators.Zachary Turner2017-03-201-0/+2
| | | | | | | | | | These are needed due to some obscure rules in the standard about how std::vector selects between copy and move constructors, which can cause a conforming implementation to attempt to select the copy constructor of RuleMatcher, which will fail since std::unique_ptr<> isn't copyable. llvm-svn: 298294