path: root/llvm/test
Commit message | Author | Age | Files | Lines
* add skylakeClement Courbet2017-04-211-2/+3
| | | | llvm-svn: 300962
* add 32 bit testsClement Courbet2017-04-211-8/+10
| | | | llvm-svn: 300961
* use repmovsb when optimizing for minsizeClement Courbet2017-04-211-0/+26
| | | | llvm-svn: 300960
* Rename FastString flag.Clement Courbet2017-04-211-2/+2
| | | | llvm-svn: 300959
* add more testsClement Courbet2017-04-211-0/+4
| | | | llvm-svn: 300958
* X86 memcpy: use REPMOVSB instead of REPMOVS{Q,D,W} for inline copiesClement Courbet2017-04-211-0/+15
| | | | | | | | | | | | when the subtarget has fast strings. This has two advantages: - Speed is improved. For example, on Haswell, throughput improvements increase linearly with size from 256 to 512 bytes, after which they plateau (e.g. 1% for 260 bytes, 25% for 400 bytes, 40% for 508 bytes). - Code is much smaller (no need to handle boundaries). llvm-svn: 300957
* [Thumb1] The recently added tADCS and tSBCS pseudo-instructions were missing ↵Artyom Skrobov2017-04-211-0/+31
| | | | | | | | | | | | | | `Uses = [CPSR]` Summary: Thanks to Oliver Stannard for helping catch this. Reviewers: olista01, efriedma Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D31815 llvm-svn: 300951
* [PartialInliner] Fix crash when inlining functions with unreachable blocks.Davide Italiano2017-04-211-0/+38
| | | | | | | | | | | | | | | | CodeExtractor looks up the dominator node corresponding to return blocks when splitting them. If one of these blocks is unreachable, there's no node in the Dom and CodeExtractor crashes because it doesn't check for domtree node validity. In theory, we could add just a check for skipping null DTNodes in `splitReturnBlock` but the fix I propose here is slightly different. To the best of my knowledge, unreachable blocks are irrelevant for the algorithm, therefore we can just skip them when building the candidate set in the constructor. Differential Revision: https://reviews.llvm.org/D32335 llvm-svn: 300946
* Revert r300932 and r300930.Akira Hatanaka2017-04-211-64/+0
| | | | | | | | | It seems that r300930 was creating an infinite loop in dag-combine when compiling the following file: MultiSource/Benchmarks/MiBench/consumer-typeset/z21.c llvm-svn: 300940
* [AArch64] Improve code generation for logical instructions takingAkira Hatanaka2017-04-211-0/+64
| | | | | | | | | | | | | | | | | | | | immediate operands. This commit adds an AArch64 dag-combine that optimizes code generation for logical instructions taking immediate operands. The optimization uses demanded bits to change a logical instruction's immediate operand so that the immediate can be folded into the immediate field of the instruction. This recommits r300913, which broke bots because I didn't fix a call to ShrinkDemandedConstant in SIISelLowering.cpp after changing the APIs of TargetLoweringOpt and TargetLowering. rdar://problem/18231627 Differential Revision: https://reviews.llvm.org/D5591 llvm-svn: 300930
* Revert r300746 (SCEV analysis for or instructions).Eli Friedman2017-04-201-38/+0
| | | | | | | | There have been multiple reports of this causing problems: a compile-time explosion on the LLVM testsuite, and a stack overflow for an opencl kernel. llvm-svn: 300928
* Revert "[AArch64] Improve code generation for logical instructions taking"Akira Hatanaka2017-04-201-64/+0
| | | | | | | | This reverts r300913. This broke bots. llvm-svn: 300916
* [Simplify] Add testcase to show that merging conditional stores for ↵Craig Topper2017-04-201-0/+39
| | | | | | triangles is sensitive to the order of the branch targets on the conditional branches. NFC llvm-svn: 300915
* [AArch64] Improve code generation for logical instructions takingAkira Hatanaka2017-04-201-0/+64
| | | | | | | | | | | | | | | | immediate operands. This commit adds an AArch64 dag-combine that optimizes code generation for logical instructions taking immediate operands. The optimization uses demanded bits to change a logical instruction's immediate operand so that the immediate can be folded into the immediate field of the instruction. rdar://problem/18231627 Differential Revision: https://reviews.llvm.org/D5591 llvm-svn: 300913
* [InstCombine] allow shl+shr demanded bits folds with splat constantsSanjay Patel2017-04-201-6/+4
| | | | llvm-svn: 300911
* [InstCombine] add tests for shl+shr demanded bits splat vector folds; NFCSanjay Patel2017-04-201-2/+24
| | | | llvm-svn: 300907
* ARM: lower "fence singlethread" to a pure compiler barrier.Tim Northover2017-04-201-0/+16
| | | | | | | | Single-threaded fences aren't required to provide any synchronization with other processing elements so there's no need for a DMB. They should still be a barrier for compiler optimizations though. llvm-svn: 300904
* [InstCombine] allow shl demanded bits folds with splat constantsSanjay Patel2017-04-203-12/+5
| | | | | | More fixes are needed to enable the helper SimplifyShrShlDemandedBits(). llvm-svn: 300898
* [InstCombine] allow ashr/lshr demanded bits folds with splat constantsSanjay Patel2017-04-202-7/+5
| | | | llvm-svn: 300888
* [InstCombine] add tests for demanded bits ashr/lshr splat constants; NFCSanjay Patel2017-04-201-0/+22
| | | | llvm-svn: 300884
* Don't emit locations that need a DW_OP_stack_value in DWARF 2 & 3.Adrian Prantl2017-04-201-0/+9
| | | | | | https://bugs.llvm.org/show_bug.cgi?id=32382 llvm-svn: 300883
* ARM: handle post-indexed NEON ops where the offset isn't the access width.Tim Northover2017-04-208-68/+124
| | | | | | | | | | | Before, we assumed that any ConstantInt offset was precisely the access width, so we could use the "[rN]!" form. ISelLowering only ever created that kind, but further simplification during combining could lead to unexpected constants and incorrect codegen. Should fix PR32658. llvm-svn: 300878
* [DWARF] Versioning for DWARF constants; verify FORMsPaul Robinson2017-04-202-162/+0
| | | | | | | | | | | | | Associate the version-when-defined with definitions of standard DWARF constants. Identify the "vendor" for DWARF extensions. Use this information to verify FORMs in .debug_abbrev are defined as of the DWARF version specified in the associated unit. Removed two tests that had specified DWARF v1 (which essentially does not exist). Differential Revision: http://reviews.llvm.org/D30785 llvm-svn: 300875
* CodeGen: Let frame index value type match alloca addr spaceYaxun Liu2017-04-201-0/+55
| | | | | | | | | | | | | | | | | | | | | | An alloca address space was recently added to the data layout. As a result, the pointer returned by alloca may have a different size than a pointer in address space 0. However, the value type of a frame index is currently assumed to have the same size as a pointer in address space 0. This patch fixes that. Most targets assume that alloca returns a pointer in address space 0, which is the default alloca address space, so the change is NFC for them. The AMDGCN target with the amdgiz environment requires this change because it assumes that alloca returns a pointer in address space 5, whose size is 32 bits, unlike a pointer in address space 0, whose size is 64 bits. Differential Revision: https://reviews.llvm.org/D32021 llvm-svn: 300864
* [MVT][SVE] Scalable vector MVTs (2/3)Amara Emerson2017-04-202-2/+2
| | | | | | | | | | | Adds scalable vector machine value types, and updates the switch statements required for tablegen. Patch by Graham Hunter. Differential Revision: https://reviews.llvm.org/D32018 llvm-svn: 300840
* [mips][msa] Mask vectors holding shift amountsPetar Jovanovic2017-04-202-0/+631
| | | | | | | | | | | | | | | | | | | | | | | | | | | Mask vectors that hold shift amounts when creating the following nodes: ISD::SHL, ISD::SRL or ISD::SRA. The instructions that use these nodes and have had their arguments altered are sll, srl, sra, bneg, bclr and bset. For these instructions, the shift amount or bit position specified in the corresponding vector elements is interpreted modulo the size of the element in bits. The problem arises when compiling with -O2, where the instructions for formats .w and .d are not generated but are instead optimized away. In this case, shift amounts that are either negative or greater than the element bit size produce incorrect results during constant folding. We remedy this by masking the operands of the nodes mentioned above before creating them, so that the final result is correct before it is placed into the constant pool. Patch by Stefan Maksimovic. Differential Revision: https://reviews.llvm.org/D31331 llvm-svn: 300839
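The masking described in the message above can be illustrated with a small Python sketch. This is not the LLVM or MSA implementation: the helper names `mask_shift_amounts` and `shl_elementwise` are hypothetical, and the code only models the "shift amount modulo element size" behaviour, assuming the element width is a power of two.

```python
def mask_shift_amounts(amounts, elem_bits):
    """Reduce each shift amount modulo the element width.

    Assumes elem_bits is a power of two, so a bitwise AND with
    (elem_bits - 1) equals the modulo operation, including for
    negative amounts.
    """
    mask = elem_bits - 1
    return [a & mask for a in amounts]


def shl_elementwise(values, amounts, elem_bits):
    """Shift each element left by its masked amount, wrapping to elem_bits."""
    wrap = (1 << elem_bits) - 1
    masked = mask_shift_amounts(amounts, elem_bits)
    return [(v << a) & wrap for v, a in zip(values, masked)]
```

With 32-bit elements, an amount of 33 behaves like 1 and an amount of -1 behaves like 31, which is the result a constant folder has to reproduce if it is to agree with the hardware semantics described in the message.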
* [ARM] Fix handling of mapping symbols when changing sectionsJohn Brawn2017-04-201-1/+18
| | | | | | | | | | | ChangeSection incorrectly registers LastEMSInfo as belonging to the previous section, not the current section. This happens to work when changing sections using .section, as the previous section is set to the current section before the call to ChangeSection, but not when using .popsection. Differential Revision: https://reviews.llvm.org/D32225 llvm-svn: 300831
* [AArch64] Fix handling of zero immediate in fmov instructionsJohn Brawn2017-04-201-5/+5
| | | | | | | | | Currently fmov #0 with a vector destination is handled incorrectly and results in fmov #-1.9375 being emitted, but it should instead give an error. This is due to the way we cope with fmov #0 with a scalar destination being an alias of fmov zr, so fix this by actually doing it through an alias. Differential Revision: https://reviews.llvm.org/D31949 llvm-svn: 300830
* [AArch64] Fix handling of integer fp immediatesJohn Brawn2017-04-201-0/+14
| | | | | | | | When an integer is used as an fp immediate we're failing to check the return value of getFP64Imm, so invalid values are silently permitted. Fix this by merging together the integer and real handling. llvm-svn: 300828
* Fix bug that caused DwarfExpression to drop DW_OP_deref from FI locationsAdrian Prantl2017-04-191-0/+35
| | | | | | | | | - introduced in r300522 and found via the Swift LLDB testsuite. The fix is to set the location kind to memory whenever a FrameIndex location is emitted. rdar://problem/31707602 llvm-svn: 300793
* Simplify test for sret attribute in instcombineReid Kleckner2017-04-192-15/+29
| | | | | | | | | This change is correct because the verifier requires that at most one argument be marked 'sret'. NFC, removes a use of AttributeList slot APIs. llvm-svn: 300784
* Temporarily revert r299221 to fix nondeterminism in ThinLTO builder.Galina Kistanova2017-04-191-11/+17
| | | | llvm-svn: 300783
* X86FrameLowering: Fix getFrameIndexReference() for 'fixed' objectsMatthias Braun2017-04-191-0/+75
| | | | | | | | | | | Debug information is calculated with getFrameIndexReference() which was missing some logic for the fixed object cases (= parameters on the stack). rdar://24557797 Differential Revision: https://reviews.llvm.org/D32204 llvm-svn: 300781
* [sanitizer-coverage] remove some more stale codeKostya Serebryany2017-04-191-12/+0
| | | | llvm-svn: 300778
* [DAG] add splat vector support for 'or' in SimplifyDemandedBitsSanjay Patel2017-04-192-19/+15
| | | | | | | | | | | I've changed one of the tests to not fold away, but we didn't and still don't do the transform that the comment claims we do (and I don't know why we'd want to do that). Follow-up to: https://reviews.llvm.org/rL300725 https://reviews.llvm.org/rL300763 llvm-svn: 300772
* [sanitizer-coverage] remove stale codeKostya Serebryany2017-04-192-35/+0
| | | | llvm-svn: 300769
* [DAG] add splat vector support for 'xor' in SimplifyDemandedBitsSanjay Patel2017-04-195-45/+36
| | | | | | | | | This allows forming more 'not' ops, so we get improvements for ISAs that have and-not. Follow-up to: https://reviews.llvm.org/rL300725 llvm-svn: 300763
* ARMFrameLowering: Reserve emergency spill slot for large argumentsMatthias Braun2017-04-191-0/+94
| | | | | | | | | | | | | | | | Re-commit after revert in r300668. Changed getMaxFPOffset() to a more conservative heuristic instead of trying to be clever and missing for some exotic calling conventions. We need to reserve an emergency spill slot in cases with large argument types that could overflow immediate offsets for FP relative address calculations. rdar://31317893 Differential Revision: https://reviews.llvm.org/D31643 llvm-svn: 300761
* [InstCombine] Add frem constant folding test (PR3316)Simon Pilgrim2017-04-191-0/+9
| | | | llvm-svn: 300757
* AMDGPU: Custom lower illegal small select typesMatt Arsenault2017-04-191-117/+272
| | | | | | | Promote them to i32 vectors to avoid unpacking and re-packing the vectors. llvm-svn: 300754
* [InstCombine] Add frem constant folding test (PR32177)Simon Pilgrim2017-04-191-0/+9
| | | | llvm-svn: 300750
* [ARM] Use TableGen patterns to select vtbl. NFC.Eli Friedman2017-04-191-1/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D32103 llvm-svn: 300749
* [SCEV] Make SCEV or modeling more aggressive.Eli Friedman2017-04-191-0/+38
| | | | | | | | | | Use haveNoCommonBitsSet to figure out whether an "or" instruction is equivalent to addition. This handles more cases than just checking for a constant on the RHS. Differential Revision: https://reviews.llvm.org/D32239 llvm-svn: 300746
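The equivalence the message above relies on can be checked with a short Python sketch. It works on concrete integers, whereas LLVM's haveNoCommonBitsSet reasons about known bits of IR values; the function names below are hypothetical stand-ins, not LLVM APIs.

```python
def have_no_common_bits_set(a, b):
    """True when no bit position is set in both non-negative integers."""
    return (a & b) == 0


def or_as_add(a, b):
    """Return a + b when it is guaranteed to equal a | b, else None.

    With no common bits set, addition produces no carries, so the
    sum and the bitwise OR are the same value.
    """
    if have_no_common_bits_set(a, b):
        return a + b
    return None
```

For example, 8 and 3 share no bits, so `8 | 3` and `8 + 3` are both 11, while 6 and 3 overlap in bit 1 and the rewrite does not apply.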
* Using address range map to speedup finding inline stack for address.Dehao Chen2017-04-191-0/+40
| | | | | | | | | | | | | | | | | | | | Summary: In the current implementation, finding the inline stack for an address incurs an expensive linear search in 2 places: * a linear search for the top-level DIE * a recursive linear traversal of the DIE tree to find the path to the leaf DIE In this patch, a map is built from each address to its corresponding leaf DIE. The inline stack is built by traversing from the leaf DIE up to the root DIE. This speeds up batch symbolization by ~10X without noticeable memory overhead. Reviewers: dblaikie Reviewed By: dblaikie Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32177 llvm-svn: 300742
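The approach in the message above (map an address to its leaf DIE, then walk up to the root) might be sketched in Python as follows. The `Die` class, `build_range_map`, and `inline_stack` are hypothetical names, not the LLVM DWARF API, and the sketch assumes the ranges are half-open and non-overlapping.

```python
import bisect


class Die:
    """A minimal stand-in for a debug-info entry with a parent link."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def build_range_map(ranges):
    """Sort (low_pc, high_pc, leaf_die) entries and index their start addresses."""
    entries = sorted(ranges, key=lambda entry: entry[0])
    starts = [entry[0] for entry in entries]
    return starts, entries


def inline_stack(range_map, address):
    """Return DIE names from the leaf up to the root, or [] if unmapped."""
    starts, entries = range_map
    index = bisect.bisect_right(starts, address) - 1
    if index < 0:
        return []
    low_pc, high_pc, die = entries[index]
    if not (low_pc <= address < high_pc):
        return []
    stack = []
    while die is not None:
        stack.append(die.name)
        die = die.parent
    return stack


# Example: bar is inlined into foo, which is inlined into main.
main = Die("main")
foo = Die("foo", parent=main)
bar = Die("bar", parent=foo)
range_map = build_range_map([
    (0x1000, 0x1100, main),
    (0x1100, 0x1120, foo),
    (0x1120, 0x1140, bar),
    (0x1140, 0x1200, foo),
    (0x1200, 0x2000, main),
])
```

Each lookup is a binary search followed by a walk whose length is the inlining depth, replacing the two linear searches the message describes.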
* Update the madd.ll test with utils/update_llc_test_checks.py (NFC)Dehao Chen2017-04-191-48/+264
| | | | llvm-svn: 300740
* PR32710: Disable using PMADDWD for unsigned short.Dehao Chen2017-04-191-5/+55
| | | | | | | | | | | | | | Summary: PMADDWD can only handle signed short. Reviewers: mkuper, wmi Reviewed By: mkuper Subscribers: andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D32236 llvm-svn: 300737
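Why unsigned shorts are a problem can be seen from a small Python model of the instruction. This is an illustrative sketch, not compiler code: the helper names are hypothetical, and the model follows the documented PMADDWD behaviour of multiplying signed 16-bit lanes and summing adjacent 32-bit products.

```python
def to_signed(value, bits):
    """Reinterpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def pmaddwd(a, b):
    """Model PMADDWD on two equal-length lists of 16-bit lanes.

    Each lane is treated as signed; adjacent products are added and
    the sum is wrapped to a signed 32-bit result.
    """
    out = []
    for i in range(0, len(a), 2):
        total = (to_signed(a[i], 16) * to_signed(b[i], 16)
                 + to_signed(a[i + 1], 16) * to_signed(b[i + 1], 16))
        out.append(to_signed(total, 32))
    return out
```

An unsigned short of 40000 has its top bit set, so the model reads it as -25536 and `pmaddwd([40000, 0], [2, 0])` yields -51072 rather than the 80000 an unsigned multiply would give.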
* AMDGPU: Don't emit amd_kernel_code_t for callable functionsMatt Arsenault2017-04-191-7/+7
| | | | | | | | | | | | This is inserted directly in the text section. The relocation for the function ends up resolving to the beginning of the amd_kernel_code_t header rather than the actual function entry point. Also skip some of the comments for initialization that only makes sense for kernels. llvm-svn: 300736
* [AMDGPU][mc][tests][NFC] Update bulk ISA tests for Gfx7 and Gfx8Artem Tamazov2017-04-192-4377/+10806
| | | | | | Added approx. 1100 gfx7 and 1040 gfx8 test cases. llvm-svn: 300734
* StructurizeCFG: Directly invert cmp instructionsMatt Arsenault2017-04-197-110/+172
| | | | | | | | | | | | | | | | The most common case for a branch condition is a single-use compare. Directly invert the branch predicate rather than adding a lot of xor i1 true instructions, which the DAG will have to fold later. This produces structurizer output that is nicer to read. It also produces some random changes in codegen, because the DAG swaps branch conditions itself and then does a poor job of dealing with those inverts. llvm-svn: 300732
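The idea of inverting the predicate instead of wrapping the condition in an xor can be sketched in Python. The tuple encoding of conditions and the `invert_condition` helper are hypothetical and only model the decision described in the message; the predicate names follow LLVM IR's integer compare predicates.

```python
# Each integer compare predicate mapped to its logical inverse.
INVERSE_PREDICATE = {
    "eq": "ne", "ne": "eq",
    "slt": "sge", "sge": "slt",
    "sgt": "sle", "sle": "sgt",
    "ult": "uge", "uge": "ult",
    "ugt": "ule", "ule": "ugt",
}


def invert_condition(cond, num_uses):
    """Invert a branch condition.

    A single-use compare is rewritten with the inverse predicate;
    anything else falls back to an explicit xor with true.
    """
    if cond[0] == "icmp" and num_uses == 1:
        _, pred, lhs, rhs = cond
        return ("icmp", INVERSE_PREDICATE[pred], lhs, rhs)
    return ("xor", cond, True)
```

The single-use check matters in this sketch because rewriting a compare that has other users would change what those users see, so the fallback keeps the original compare and negates its result instead.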
* [GVN] Don't coerce non-integral pointers to integers or vice versaSanjoy Das2017-04-192-0/+78
| | | | | | | | | | | | | | | Summary: See http://llvm.org/docs/LangRef.html#non-integral-pointer-type The NewGVN test does not fail without these changes (perhaps it does not try to coerce pointers <-> integers to begin with?), but I added the test case anyway. Reviewers: dberlin Subscribers: mcrosier, llvm-commits, Prazek Differential Revision: https://reviews.llvm.org/D32208 llvm-svn: 300730