summaryrefslogtreecommitdiffstats
path: root/llvm/test
Commit message (Collapse)AuthorAgeFilesLines
* AVX-512: fixed a bug in fp_to_uint pattern on KNLElena Demikhovsky2016-03-291-151/+499
| | | | | | | | | Fixed fp_to_uint instruction selection on KNL. One pattern was missing for <4 x double> to <4 x i32> Differential Revision: http://reviews.llvm.org/D18512 llvm-svn: 264701
* BitcodeReader: Allow METADATA_STRINGS to only have !""Duncan P. N. Exon Smith2016-03-291-0/+7
| | | | | | | Support parsing a METADATA_STRINGS record that only has a single piece of metadata, !"". Fixes a corner case in r264551. llvm-svn: 264699
* [SimlifyCFG] Prevent passes from destroying canonical loop structure, ↵Hyojin Sung2016-03-295-42/+42
| | | | | | | | | | | | | | | | | especially for nested loops When eliminating or merging almost empty basic blocks, the existence of non-trivial PHI nodes is currently used to recognize potential loops of which the block is the header and keep the block. However, the current algorithm fails if the loops' exit condition is evaluated only with volatile values hence no PHI nodes in the header. Especially when such a loop is an outer loop of a nested loop, the loop is collapsed into a single loop which prevent later optimizations from being applied (e.g., transforming nested loops into simplified forms and loop vectorization). The patch augments the existing PHI node-based check by adding a pre-test if the BB actually belongs to a set of loop headers and not eliminating it if yes. llvm-svn: 264697
* [PowerPC] Refactor popcnt[dw] target featuresHal Finkel2016-03-291-0/+2
| | | | | | | | | Instead of using two feature bits, one to indicate the availability of the popcnt[dw] instructions, and another to indicate whether or not they're fast, use a single enum. This allows more consistent control via target attribute strings, and via Clang's command line. llvm-svn: 264690
* [Codegen] Decrease minimum jump table density.Kyle Butt2016-03-298-30/+111
| | | | | | | | | | | Minimum density for both optsize and non optsize are now options -sparse-jump-table-density (default 10) for non optsize functions -dense-jump-table-density (default 40) for optsize functions, which matches the current default. This improves several benchmarks at google at the cost of a small codesize increase. For code compiled with -Os, the old behavior continues llvm-svn: 264689
* regenerate checksSanjay Patel2016-03-283-74/+107
| | | | llvm-svn: 264677
* fix checks: *_DAG -> *-DAGSanjay Patel2016-03-282-4/+4
| | | | llvm-svn: 264676
* [Coverage] Fix the expected counts in instrprof-comdat.hVedant Kumar2016-03-281-6/+3
| | | | llvm-svn: 264675
* fix CHECK_NEXT -> CHECK-NEXTSanjay Patel2016-03-282-2/+2
| | | | llvm-svn: 264674
* fix CHECK_DAG -> CHECK-DAGSanjay Patel2016-03-282-4/+4
| | | | llvm-svn: 264673
* fix CHECK_NEXT -> CHECK-NEXTSanjay Patel2016-03-281-4/+4
| | | | llvm-svn: 264672
* fix CHECK_LABEL -> CHECK-LABELSanjay Patel2016-03-281-16/+16
| | | | llvm-svn: 264671
* trailing whitespaceSanjay Patel2016-03-281-315/+315
| | | | llvm-svn: 264670
* Remove personality for declarations in CloneModule.Evgeniy Stepanov2016-03-281-0/+18
| | | | | | | | | | Personality is copied as part of copyFunctionAttributes, but it is invalid on a declaration. Remove the personality attribute it the function body is not cloned. Also add a verifier run over output modules in the llvm-split tool. llvm-svn: 264667
* [X86][SSE] Vectorize a bit (AND/XOR/OR) op if a BUILD_VECTOR has the same op ↵Simon Pilgrim2016-03-284-214/+119
| | | | | | | | | | | | | | for all their scalar elements. If all a BUILD_VECTOR's source elements are the same bit (AND/XOR/OR) operation type and each has one constant operand, lower to a pair of BUILD_VECTOR and just apply the bit operation to the vectors. The constant operands will form a constant vector meaning that we still only have a single BUILD_VECTOR to lower and we will have replaced all the scalarized operations with a single SSE equivalent. Its not in our interest to start make a general purpose vectorizer from this, but I'm seeing enough of these scalar bit operations from the later legalization/scalarization stages to support them at least. Differential Revision: http://reviews.llvm.org/D18492 llvm-svn: 264666
* fix CHECK_NEXT -> CHECK-NEXTSanjay Patel2016-03-281-15/+15
| | | | llvm-svn: 264661
* Reapply (2x) "[PGO] Fix name encoding for ObjC-like functions"Vedant Kumar2016-03-289-7/+15
| | | | | | | | | | | | | | | | | | | | | | | | Function names in ObjC can have spaces in them. This interacts poorly with name compression, which uses spaces to separate PGO names. Fix the issue by using a different separator and update a test. I chose "\01" as the separator because 1) it's non-printable, 2) we strip it from PGO names, and 3) it's the next natural choice once "\00" is discarded (that one's overloaded). What's changed since the original commit? - I fixed up the covmap-V2 binary format tests using a linux VM. - I weakened the CHECK lines in instrprof-comdat.h to account for the fact that there have been bugfixes to clang coverage. These will be fixed up in a follow-up. - I added an assert to make sure we don't get bitten by this again. - I constructed the c-general.profraw file without name compression enabled to appease some bots. Differential Revision: http://reviews.llvm.org/D18516 llvm-svn: 264658
* Add an IR Verifier check for orphaned DICompileUnits.Adrian Prantl2016-03-283-2/+17
| | | | | | | | | A DICompileUnit that is not listed in llvm.dbg.cu will cause assertion failures and/or crashes in the backend. The Verifier should reject this. rdar://problem/25369499 llvm-svn: 264657
* [LVers] Change CHECK_LABEL to CHECK-LABEL (underscore->dash)Adam Nemet2016-03-281-1/+1
| | | | | | Thanks to Sanjoy for catching this. llvm-svn: 264656
* [asan] Fix testcase for r264645Ryan Govostes2016-03-281-6/+6
| | | | llvm-svn: 264652
* Handle section vs global name conflict.Evgeniy Stepanov2016-03-282-6/+138
| | | | | | | | | | | | | This is a fix for PR26941. When there is both a section and a global definition with the same name, the global wins. Section symbols are not added to the symbol table; section references are left undefined and fixed up in the object writer unless they've been satisfied by some other definition. llvm-svn: 264649
* [asan] Support dead code stripping on Mach-O platformsRyan Govostes2016-03-282-1/+38
| | | | | | | | | | | | | | | | | | On OS X El Capitan and iOS 9, the linker supports a new section attribute, live_support, which allows dead stripping to remove dead globals along with the ASAN metadata about them. With this change __asan_global structures are emitted in a new __DATA,__asan_globals section on Darwin. Additionally, there is a __DATA,__asan_liveness section with the live_support attribute. Each entry in this section is simply a tuple that binds together the liveness of a global variable and its ASAN metadata structure. Thus the metadata structure will be alive if and only if the global it references is also alive. Review: http://reviews.llvm.org/D16737 llvm-svn: 264645
* Revert "Reapply "[PGO] Fix name encoding for ObjC-like functions""Vedant Kumar2016-03-289-12/+7
| | | | | | | This reverts commit r264641 to investigate why c-general.test is failing on the bots. llvm-svn: 264643
* Reapply "[PGO] Fix name encoding for ObjC-like functions"Vedant Kumar2016-03-289-7/+12
| | | | | | | | | | | | | | | | | | | | | Function names in ObjC can have spaces in them. This interacts poorly with name compression, which uses spaces to separate PGO names. Fix the issue by using a different separator and update a test. I chose "\01" as the separator because 1) it's non-printable, 2) we strip it from PGO names, and 3) it's the next natural choice once "\00" is discarded (that one's overloaded). This reverts the revert commit beaf3d18. What's changed? - I fixed up the covmap-V2 binary format tests using a linux VM. - I updated the expected counts in instrprof-comdat.h to account for the fact that there have been bugfixes to clang coverage. - I added an assert to make sure we don't get bitten by this again. Differential Revision: http://reviews.llvm.org/D18516 llvm-svn: 264641
* MIRParser: Add %subreg.xxx syntax for subregister index operandsMatthias Braun2016-03-282-0/+58
| | | | | | Differential Revision: http://reviews.llvm.org/D18279 llvm-svn: 264608
* CodeGen: Correct specification of PHI nodesMatthias Braun2016-03-281-2/+2
| | | | | | | | | | They do have a def machine operand. Fixing the definition is necessary for an upcoming patch. Differential Revision: http://reviews.llvm.org/D18384 llvm-svn: 264607
* [AArch64] Do not lower scalar sdiv/udiv to a shifts + mul sequence when ↵Haicheng Wu2016-03-281-0/+45
| | | | | | | | optimizing for minsize Mimic what x86 does when optimizing sdiv/udiv for minsize. llvm-svn: 264606
* Revert "[SimlifyCFG] Prevent passes from destroying canonical loop ↵Reid Kleckner2016-03-284-40/+40
| | | | | | | | | | structure, especially for nested loops" This reverts commit r264596. It does not compile. llvm-svn: 264604
* [PowerPC] On the A2, popcnt[dw] are very slowHal Finkel2016-03-281-4/+18
| | | | | | | | | | | | | | | | The A2 cores support the popcntw/popcntd instructions, but they're microcoded, and slower than our default software emulation. Specifically, popcnt[dw] take approximately 74 cycles, whereas our software emulation takes only 24-28 cycles. I've added a new target feature to indicate a slow popcnt[dw], instead of just removing the existing target feature from the a2/a2q processor models, because: 1. This allows us to return more accurate information via the TTI interface (I recognize that this currently makes no practical difference) 2. Is hopefully easier to understand (it allows the core's features to match its manual while still having the desired effect). llvm-svn: 264600
* [SimlifyCFG] Prevent passes from destroying canonical loop structure, ↵Hyojin Sung2016-03-284-40/+40
| | | | | | | | | | | | | | | | especially for nested loops When eliminating or merging almost empty basic blocks, the existence of non-trivial PHI nodes is currently used to recognize potential loops of which the block is the header and keep the block. However, the current algorithm fails if the loops' exit condition is evaluated only with volatile values hence no PHI nodes in the header. Especially when such a loop is an outer loop of a nested loop, the loop is collapsed into a single loop which prevent later optimizations from being applied (e.g., transforming nested loops into simplified forms and loop vectorization). The patch augments the existing PHI node-based check by adding a pre-test if the BB actually belongs to a set of loop headers and not eliminating it if yes. llvm-svn: 264596
* Introduce MachineFunctionProperties and the AllVRegsAllocated propertyDerek Schuff2016-03-285-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | MachineFunctionProperties represents a set of properties that a MachineFunction can have at particular points in time. Existing examples of this idea are MachineRegisterInfo::isSSA() and MachineRegisterInfo::tracksLiveness() which will eventually be switched to use this mechanism. This change introduces the AllVRegsAllocated property; i.e. the property that all virtual registers have been allocated and there are no VReg operands left. With this mechanism, passes can declare that they require a particular property to be set, or that they set or clear properties by implementing e.g. MachineFunctionPass::getRequiredProperties(). The MachineFunctionPass base class verifies that the requirements are met, and handles the setting and clearing based on the delcarations. Passes can also directly query and update the current properties of the MF if they want to have conditional behavior. This change annotates the target-independent post-regalloc passes; future changes will also annotate target-specific ones. Reviewers: qcolombet, hfinkel Differential Revision: http://reviews.llvm.org/D18421 llvm-svn: 264593
* [llvm-size] Implement --common optionHemant Kulkarni2016-03-281-0/+29
| | | | | | Differential Revision: http://reviews.llvm.org/D16820 llvm-svn: 264591
* AMDGPU/SI: Limit load clustering to 16 bytes instead of 4 instructionsTom Stellard2016-03-283-8/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This helps prevent load clustering from drastically increasing register pressure by trying to cluster 4 SMRDx8 loads together. The limit of 16 bytes was chosen, because it seems like that was the original intent of setting the limit to 4 instructions, but more analysis could show that a different limit is better. This fixes yields small decreases in register usage with shader-db, but also helps avoid a large increase in register usage when lane mask tracking is enabled in the machine scheduler, because lane mask tracking enables more opportunities for load clustering. shader-db stats: 2379 shaders in 477 tests Totals: SGPRS: 49744 -> 48600 (-2.30 %) VGPRS: 34120 -> 34076 (-0.13 %) Code Size: 1282888 -> 1283184 (0.02 %) bytes LDS: 28 -> 28 (0.00 %) blocks Scratch: 495616 -> 492544 (-0.62 %) bytes per wave Max Waves: 6843 -> 6853 (0.15 %) Wait states: 0 -> 0 (0.00 %) Reviewers: nhaehnle, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18451 llvm-svn: 264589
* [SimplifyLibCalls] Transform printf("%s", "a") -> putchar('a').Davide Italiano2016-03-281-0/+12
| | | | llvm-svn: 264588
* [Hexagon] Improve handling of unaligned vector loads and storesKrzysztof Parzyszek2016-03-281-0/+31
| | | | llvm-svn: 264584
* [Hexagon] Only use restore functions for single register at -OzKrzysztof Parzyszek2016-03-281-0/+42
| | | | llvm-svn: 264581
* Sparc: silently ignore .proc assembler directiveDouglas Katzman2016-03-281-0/+4
| | | | | | Differential Revision: http://reviews.llvm.org/D18463 llvm-svn: 264579
* [lanai] Add Lanai backend.Jacques Pienaar2016-03-2820-8/+2348
| | | | | | | | | | Add the Lanai backend to lib/Target. General Lanai backend discussion on llvm-dev thread "[RFC] Lanai backend" (http://lists.llvm.org/pipermail/llvm-dev/2016-February/095118.html). Differential Revision: http://reviews.llvm.org/D17011 llvm-svn: 264578
* [Power9] Implement new altivec instructions: bcd* seriesChuang-Yu Cheng2016-03-282-0/+86
| | | | | | | | | | | | | | | | | | | | | | | | | This patch implements the following altivec instructions: - Decimal Convert From/to National/Zoned/Signed-QWord: bcdcfn. bcdcfz. bcdctn. bcdctz. bcdcfsq. bcdctsq. - Decimal Copy-Sign/Set-Sign: bcdcpsgn. bcdsetsgn. - Decimal Shift/Unsigned-Shift/Shift-and-Round: bcds. bcdus. bcdsr. - Decimal (Unsigned) Truncate: bcdtrunc. bcdutrunc. Total 13 instructions Thanks Amehsan's advice! Thanks Kit's great help! Reviewers: hal, nemanja, kbarton, tjablin, amehsan http://reviews.llvm.org/D17838 llvm-svn: 264568
* [Power9] Implement new vsx instructions: insert, extract, test data class, ↵Chuang-Yu Cheng2016-03-282-0/+210
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | min/max, reverse, permute, splat This change implements the following vsx instructions: - Scalar Insert/Extract xsiexpdp xsiexpqp xsxexpdp xsxsigdp xsxexpqp xsxsigqp - Vector Insert/Extract xviexpdp xviexpsp xvxexpdp xvxexpsp xvxsigdp xvxsigsp xxextractuw xxinsertw - Scalar/Vector Test Data Class xststdcdp xststdcsp xststdcqp xvtstdcdp xvtstdcsp - Maximum/Minimum xsmaxcdp xsmaxjdp xsmincdp xsminjdp - Vector Byte-Reverse/Permute/Splat xxbrd xxbrh xxbrq xxbrw xxperm xxpermr xxspltib 30 instructions Thanks Nemanja for invaluable discussion! Thanks Kit's great help! Reviewers: hal, nemanja, kbarton, tjablin, amehsan http://reviews.llvm.org/D16842 llvm-svn: 264567
* AVX-512: Fixed ICMP instruction selection for i1 operandsElena Demikhovsky2016-03-281-21/+111
| | | | | | | | | | ICMP instruction selection fails on SKX and KNL for i1 operand. I use XOR to resolve: (A == B) is equivalent to (A xor B) == 0 Differential Revision: http://reviews.llvm.org/D18511 llvm-svn: 264566
* [Power9] Implement new vsx instructions: quad-precision move, fp-arithmeticChuang-Yu Cheng2016-03-282-0/+140
| | | | | | | | | | | | | | | | | | | | This change implements the following vsx instructions: - quad-precision move xscpsgnqp, xsabsqp, xsnegqp, xsnabsqp - quad-precision fp-arithmetic xsaddqp(o) xsdivqp(o) xsmulqp(o) xssqrtqp(o) xssubqp(o) xsmaddqp(o) xsmsubqp(o) xsnmaddqp(o) xsnmsubqp(o) 22 instructions Thanks Nemanja and Kit for careful review and invaluable discussion! Reviewers: hal, nemanja, kbarton, tjablin, amehsan http://reviews.llvm.org/D16110 llvm-svn: 264565
* llvm/test/Transforms/FunctionImport/funcimport.ll: -stats REQUIRES +Asserts.NAKAMURA Takumi2016-03-281-0/+2
| | | | llvm-svn: 264561
* Reapply ~"Bitcode: Collect all MDString records into a single blob"Duncan P. N. Exon Smith2016-03-271-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Spiritually reapply commit r264409 (reverted in r264410), albeit with a bit of a redesign. Firstly, avoid splitting the big blob into multiple chunks of strings. r264409 imposed an arbitrary limit to avoid a massive allocation on the shared 'Record' SmallVector. The bug with that commit only reproduced when there were more than "chunk-size" strings. A test for this would have been useless long-term, since we're liable to adjust the chunk-size in the future. Thus, eliminate the motivation for chunk-ing by storing the string sizes in the blob. Here's the layout: vbr6: # of strings vbr6: offset-to-blob blob: [vbr6]: string lengths [char]: concatenated strings Secondly, make the output of llvm-bcanalyzer readable. I noticed when debugging r264409 that llvm-bcanalyzer was outputting a massive blob all in one line. Past a small number, the strings were impossible to split in my head, and the lines were way too long. This version adds support in llvm-bcanalyzer for pretty-printing. <STRINGS abbrevid=4 op0=3 op1=9/> num-strings = 3 { 'abc' 'def' 'ghi' } From the original commit: Inspired by Mehdi's similar patch, http://reviews.llvm.org/D18342, this should (a) slightly reduce bitcode size, since there is less record overhead, and (b) greatly improve reading speed, since blobs are super cheap to deserialize. llvm-svn: 264551
* Support: Implement StreamingMemoryObject::getPointerDuncan P. N. Exon Smith2016-03-272-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | The implementation is fairly obvious. This is preparation for using some blobs in bitcode. For clarity (and perhaps future-proofing?), I moved the call to JumpToBit in BitstreamCursor::readRecord ahead of calling MemoryObject::getPointer, since JumpToBit can theoretically (a) read bytes, which (b) invalidates the blob pointer. This isn't strictly necessary the two memory objects we have: - The return of RawMemoryObject::getPointer is valid until the memory object is destroyed. - StreamingMemoryObject::getPointer is valid until the next chunk is read from the stream. Since the JumpToBit call is only going ahead to a word boundary, we'll never load another chunk. However, reordering makes it clear by inspection that the blob returned by BitstreamCursor::readRecord will be valid. I added some tests for StreamingMemoryObject::getPointer and BitstreamCursor::readRecord. llvm-svn: 264549
* Use DAG check to try to appease botTeresa Johnson2016-03-271-1/+1
| | | | | | | | Try to appease http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/34772. This was the only check that didn't use DAG and it wasn't found. llvm-svn: 264538
* [ThinLTO] Add optional import message and statisticsTeresa Johnson2016-03-271-1/+14
| | | | | | | | | | | | | | | | | | | | | | Summary: Add a statistic to count the number of imported functions. Also, add a new -print-imports option to emit a trace of imported functions, that works even for an NDEBUG build. Note that emitOptimizationRemark does not work for the above printing as it expects a Function object and DebugLoc, neither of which we have with summary-based importing. This is part 2 of D18487, the first part was committed separately as r264536. Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D18487 llvm-svn: 264537
* [PowerPC] Map max/minnum intrinsics and fmax/fmin to ISD nodes for CTR-based ↵Hal Finkel2016-03-271-4/+191
| | | | | | | | | | | | | | | | | loop legality Intrinsic::maxnum and Intrinsic::minnum, along with the associated libc function calls (fmax[f], etc.) generally map to function calls after lowering. For some vector types with QPX at least, however, we can legally lower these, and we don't need to prohibit CTR-based loops on their account. It turned out, however, that the logic that checked the opcodes associated with intrinsics was broken (it would set the Opcode variable, but that variable was later checked only if set for some otherwise-external function call. This fixes the latter problem and adds the FMAX/MINNUM mappings. llvm-svn: 264532
* [Verifier] Reject PHIs using defs from own block.Michael Kruse2016-03-262-1/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reject the following IR as malformed (assuming that %entry, %next are not in a loop): next: %y = phi i32 [ 0, %entry ] %x = phi i32 [ %y, %entry ] Such PHI nodes came up in PR26718. While there was no consensus on whether or not this is valid IR, most opinions on that bug and in a discussion on the llvm-dev mailing list tended towards a "strict interpretation" (term by Joseph Tremoulet) of PHI node uses. Also, the language reference explicitly states that "the use of each incoming value is deemed to occur on the edge from the corresponding predecessor block to the current block" and `DominatorTree::dominates(Instruction*, Use&)` uses this definition as well. For the code mentioned in PR15384, clang does not compile to such PHIs (anymore?). The test case still hangs when replacing `%tmp6` with `%tmp` in revisions before r176366 (where PR15384 has been fixed). The occurrence of %tmp6 therefore was probably unintentional. Its value is not used except in other PHIs. Reviewers: majnemer, reames, JosephTremoulet, bkramer, grosser, jdoerfert, kparzysz, sanjoy Differential Revision: http://reviews.llvm.org/D18443 llvm-svn: 264528
* [SimplifyCFG] propagate branch metadata when creating select (PR26636)Sanjay Patel2016-03-261-2/+5
| | | | llvm-svn: 264527
OpenPOWER on IntegriCloud