summaryrefslogtreecommitdiffstats
path: root/llvm
Commit message (Collapse)AuthorAgeFilesLines
* Extract LaneBitmask into a separate typeKrzysztof Parzyszek2016-12-1540-326/+421
| | | | | | | | | | | | Specifically avoid implicit conversions from/to integral types to avoid potential errors when changing the underlying type. For example, a typical initialization of a "full" mask was "LaneMask = ~0u", which would result in a value of 0x00000000FFFFFFFF if the type was extended to uint64_t. Differential Revision: https://reviews.llvm.org/D27454 llvm-svn: 289820
* [CostModel][X86] Updated reverse shuffle costsSimon Pilgrim2016-12-152-37/+151
| | | | llvm-svn: 289819
* [TEST] Initial commit of tests for minmax horizontal reductions.Alexey Bataev2016-12-151-0/+1725
| | | | llvm-svn: 289817
* Revert "[TESTS] Initial commit of tests, by Andrew Tischenko"Alexey Bataev2016-12-152-350/+0
| | | | | | This reverts commit ee709f8988653a0334fbf100cdbbdd83a3933347. llvm-svn: 289814
* [InstCombine] New opportunities for FoldAndOfICmp and FoldXorOfICmpEhsan Amiri2016-12-153-2/+302
| | | | | | | | | | | | | | | | | A number of new patterns for simplifying and/xor of icmp: (icmp ne %x, 0) ^ (icmp ne %y, 0) => icmp ne %x, %y if the following is true: 1- (%x = and %a, %mask) and (%y = and %b, %mask) 2- %mask is a power of 2. (icmp eq %x, 0) & (icmp ne %y, 0) => icmp ult %x, %y if the following is true: 1- (%x = and %a, %mask1) and (%y = and %b, %mask2) 2- Let %t be the smallest power of 2 where %mask1 & %t != 0. Then for any %s that is a power of 2 and %s & %mask2 != 0, we must have %s <= %t. For example if %mask1 = 24 and %mask2 = 16, setting %s = 16 and %t = 8 violates condition (2) above. So this optimization cannot be applied. llvm-svn: 289813
* [CostModel] Fix long standing bug with reverse shuffle mask detectionSimon Pilgrim2016-12-152-1/+32
| | | | | | Incorrect 'undef' mask index matching meant that broadcast shuffles could be detected as reverse shuffles llvm-svn: 289811
* [TESTS] Initial commit of tests, by Andrew TischenkoAlexey Bataev2016-12-152-0/+350
| | | | llvm-svn: 289807
* [Power9] Allow AnyExt immediates for XXSPLTIBNemanja Ivanovic2016-12-153-7/+16
| | | | | | | | | | In some situations, the BUILD_VECTOR node that builds a v18i8 vector by a splat of an i8 constant will end up with signed 8-bit values and other situations, it'll end up with unsigned ones. Handle both situations. Fixes PR31340. llvm-svn: 289804
* [AVR] Support floats in the instrumention passDylan McKay2016-12-152-18/+35
| | | | | | This also refactors some common code into the 'GetTypeName' method. llvm-svn: 289803
* [CostModel][X86] Add tests for reverse shuffle costsSimon Pilgrim2016-12-151-0/+143
| | | | llvm-svn: 289800
* Add missing triple target for numeric section flag testPrakhar Bahuguna2016-12-151-1/+1
| | | | llvm-svn: 289798
* Simplify format member detection in FormatVariadicPavel Labath2016-12-157-161/+83
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This replaces the format member search, which was quite complicated, with a more direct approach to detecting whether a class should be formatted using the format-member method. Instead we use a special type llvm::format_adapter, which every adapter must inherit from. Then the search can be simply implemented with the is_base_of type trait. Aside from the simplification, I like this way more because it makes it more explicit that you are supposed to use this type only for adapter-like formattings, and the other approach (format_provider overloads) should be used as a default (a mistake I made when first trying to use this library). The only slight change in behaviour here is that now choose the format-adapter branch even if the format member invocation will fail to compile (e.g. because it is a non-const member function and we are passing a const adapter), whereas previously we would have gone on to search for format_providers for the type. However, I think that is actually a good thing, as it probably means the programmer did something wrong. Reviewers: zturner, inglorion Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27679 llvm-svn: 289795
* [Thumb] Teach ISel how to lower compares of AND bitmasks efficientlySjoerd Meijer2016-12-159-23/+232
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is essentially a recommit of r285893, but with a correctness fix. The problem of the original commit was that this: bic r5, r7, #31 cbz r5, .LBB2_10 got rewritten into: lsrs r5, r7, #5 beq .LBB2_10 The result in destination register r5 is not the same and this is incorrect when r5 is not dead. So this fix includes checking the uses of the AND destination register. And also, compared to the original commit, some regression tests didn't need changing anymore because of this extra check. For completeness, this was the original commit message: For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more efficient instruction selection if the bitmask is one consecutive sequence of set bits (32 - clz(bm) - ctz(bm) == popcount(bm)). 1) If the bitmask touches the LSB, then we can remove all the upper bits and set the flags by doing one LSLS. 2) If the bitmask touches the MSB, then we can remove all the lower bits and set the flags with one LSRS. 3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit into the sign bit with one LSLS and change the condition query from NE/EQ to MI/PL (we could also implement this by shifting into the carry bit and branching on BCC/BCS). 4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower zero bits of the mask. 1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two 16-bit instructions but can elide the CMP and doesn't require materializing a complex immediate, so is also a win. Differential Revision: https://reviews.llvm.org/D27761 llvm-svn: 289794
* [AVR] Add argument indices to the instrumention hook functionsDylan McKay2016-12-151-2/+4
| | | | | | | This allows the instrumention hook functions to do better pretty-printing. llvm-svn: 289793
* Fix for build warning in execute-only supportPrakhar Bahuguna2016-12-151-2/+2
| | | | llvm-svn: 289788
* Allow ELF section flags to be specified numericallyPrakhar Bahuguna2016-12-152-0/+41
| | | | | | | | | | | | | | | | | Summary: GAS already allows flags for sections to be specified directly as a numeric value. This functionality is particularly useful for setting processor or application-specific values that may not be directly supported or understood by LLVM. This patch allows LLVM to use numeric section flag values verbatim if specified by the assembly file. Reviewers: grosbach, rafael, t.p.northover, rengolin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27451 llvm-svn: 289785
* [ARM] Implement execute-only support in CodeGenPrakhar Bahuguna2016-12-1527-20/+455
| | | | | | | | | | | | | | | | | | | | This implements execute-only support for ARM code generation, which prevents the compiler from generating data accesses to code sections. The following changes are involved: * Add the CodeGen option "-arm-execute-only" to the ARM code generator. * Add the clang flag "-mexecute-only" as well as the GCC-compatible alias "-mpure-code" to enable this option. * When enabled, literal pools are replaced with MOVW/MOVT instructions, with VMOV used in addition for floating-point literals. As the MOVT instruction is required, execute-only support is only available in Thumb mode for targets supporting ARMv8-M baseline or Thumb2. * Jump tables are placed in data sections when in execute-only mode. * The execute-only text section is assigned section ID 0, and is marked as unreadable with the SHF_ARM_PURECODE flag with symbol 'y'. This also overrides selection of ELF sections for globals. llvm-svn: 289784
* Add missing -mtriple to MIR test caseSanjoy Das2016-12-151-1/+1
| | | | llvm-svn: 289779
* Attempt to fix llvm-readobj crash on ppc64 due to r289674Yaxun Liu2016-12-151-1/+1
| | | | llvm-svn: 289777
* Fix go bindings after r289702 (hopefully, don't really know how to buildDaniel Jasper2016-12-151-2/+2
| | | | | | them, build.sh seems to be broken). llvm-svn: 289775
* [libFuzzer] enable the failure-resistant merge by default (with ↵Kostya Serebryany2016-12-154-27/+31
| | | | | | trace-pc-guard only) llvm-svn: 289772
* [AVR] Whitelist the avrlit config environment variablesDylan McKay2016-12-151-1/+1
| | | | | | This allows us to use `lit` to run on-target execution tests. llvm-svn: 289769
* Revert part of r289765 that is not necessaryHal Finkel2016-12-151-2/+2
| | | | | | | | CS.doesNotAccessMemory(ArgNo) and CS.onlyReadsMemory(ArgNo) calls dataOperandHasImpliedAttr, so revert this part of r289765 because it should not be necessary. llvm-svn: 289768
* Trying to fix NDEBUG build after r289764Hal Finkel2016-12-151-0/+2
| | | | llvm-svn: 289766
* Fix argument attribute queries with bundle operandsHal Finkel2016-12-152-4/+6
| | | | | | | When iterating over data operands in AA, don't make argument-attribute-specific queries on bundle operands. Trying to fix self hosting... llvm-svn: 289765
* [MachineBlockPlacement] Don't make blocks "uneditable"Sanjoy Das2016-12-153-7/+116
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This fixes an issue with MachineBlockPlacement due to a badly timed call to `analyzeBranch` with `AllowModify` set to true. The timeline is as follows: 1. `MachineBlockPlacement::maybeTailDuplicateBlock` calls `TailDup.shouldTailDuplicate` on its argument, which in turn calls `analyzeBranch` with `AllowModify` set to true. 2. This `analyzeBranch` call edits the terminator sequence of the block based on the physical layout of the machine function, turning an unanalyzable non-fallthrough block to a unanalyzable fallthrough block. Normally MBP bails out of rearranging such blocks, but this block was unanalyzable non-fallthrough (and thus rearrangeable) the first time MBP looked at it, and so it goes ahead and decides where it should be placed in the function. 3. When placing this block MBP fails to analyze and thus update the block in keeping with the new physical layout. Concretely, before (1) we have something like: ``` LBL0: < unknown terminator op that may branch to LBL1 > jmp LBL1 LBL1: ... A LBL2: ... B ``` In (2), analyze branch simplifies this to ``` LBL0: < unknown terminator op that may branch to LBL2 > ;; jmp LBL1 <- redundant jump removed LBL1: ... A LBL2: ... B ``` In (3), MachineBlockPlacement goes ahead with its plan of putting LBL2 after the first block since that is profitable. ``` LBL0: < unknown terminator op that may branch to LBL2 > ;; jmp LBL1 <- redundant jump LBL2: ... B LBL1: ... A ``` and the program now has incorrect behavior (we no longer fall-through from `LBL0` to `LBL1`) because MBP can no longer edit LBL0. There are several possible solutions, but I went with removing the teeth off of the `analyzeBranch` calls in TailDuplicator. That makes thinking about the result of these calls easier, and breaks nothing in the lit test suite. I've also added some bookkeeping to the MachineBlockPlacement pass and used that to write an assert that would have caught this. Reviewers: chandlerc, gberry, MatzeB, iteratee Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D27783 llvm-svn: 289764
* [AVX-512][InstCombine] Add masked scalar FMA intrinsics to ↵Craig Topper2016-12-153-0/+435
| | | | | | SimplifyDemandedVectorElts. llvm-svn: 289759
* Fix iterator-invalidation issueHal Finkel2016-12-151-1/+3
| | | | | | | | | Inserting a new key into a DenseMap potentially invalidates iterators into that map. Trying to fix an issue from r289755 triggering this assertion: Assertion `isHandleInSync() && "invalid iterator access!"' failed. llvm-svn: 289757
* Remove the AssumptionCacheHal Finkel2016-12-1599-1358/+572
| | | | | | | | | After r289755, the AssumptionCache is no longer needed. Variables affected by assumptions are now found by using the new operand-bundle-based scheme. This new scheme is more computationally efficient, and also we need much less code... llvm-svn: 289756
* Make processing @llvm.assume more efficient by using operand bundlesHal Finkel2016-12-1519-149/+279
| | | | | | | | | | | | | | | | | | | | | | | | | | | There was an efficiency problem with how we processed @llvm.assume in ValueTracking (and other places). The AssumptionCache tracked all of the assumptions in a given function. In order to find assumptions relevant to computing known bits, etc. we searched every assumption in the function. For ValueTracking, that means that we did O(#assumes * #values) work in InstCombine and other passes (with a constant factor that can be quite large because we'd repeat this search at every level of recursion of the analysis). Several of us discussed this situation at the last developers' meeting, and this implements the discussed solution: Make the values that an assume might affect operands of the assume itself. To avoid exposing this detail to frontends and passes that need not worry about it, I've used the new operand-bundle feature to add these extra call "operands" in a way that does not affect the intrinsic's signature. I think this solution is relatively clean. InstCombine adds these extra operands based on what ValueTracking, LVI, etc. will need and then those passes need only search the users of the values under consideration. This should fix the computational-complexity problem. At this point, no passes depend on the AssumptionCache, and so I'll remove that as a follow-up change. Differential Revision: https://reviews.llvm.org/D27259 llvm-svn: 289755
* Add testcases for some shuffle bugs.Eli Friedman2016-12-152-0/+205
| | | | | | | See https://llvm.org/bugs/show_bug.cgi?id=31301 and https://llvm.org/bugs/show_bug.cgi?id=31364 . llvm-svn: 289751
* Fix test/tools/lto/hide-linkonce-odr.ll after r289719Nico Weber2016-12-151-0/+1
| | | | llvm-svn: 289750
* [NVPTX] Remove dead #defines from NVPTXUtilities.h.Justin Lebar2016-12-151-3/+0
| | | | llvm-svn: 289747
* Use PIC relocation model as default for PowerPC64 ELF.Joerg Sonnenberger2016-12-1520-35/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Most of the PowerPC64 code generation for the ELF ABI is already PIC. There are four main exceptions: (1) Constant pointer arrays etc. should in writeable sections. (2) The TOC restoration NOP after a call is needed for all global symbols. While GNU ld has a workaround for questionable GCC self-calls, we trigger the checks for calls from COMDAT sections as they cross input sections and are therefore not considered self-calls. The current decision is questionable and suboptimal, but outside the scope of the change. (3) TLS access can not use the initial-exec model. (4) Jump tables should use relative addresses. Note that the current encoding doesn't work for the large code model, but it is more compact than the default for any non-trivial jump table. Improving this is again beyond the scope of this change. At least (1) and (3) are assumptions made in target-independent code and introducing additional hooks is a bit messy. Testing with clang shows that a -fPIC binary is 600KB smaller than the corresponding -fno-pic build. Separate testing from improved jump table encodings would explain only about 100KB or so. The rest is expected to be a result of more aggressive immediate forming for -fno-pic, where the -fPIC binary just uses TOC entries. This change brings the LLVM output in line with the GCC output, other PPC64 compilers like XLC on AIX are known to produce PIC by default as well. The relocation model can still be provided explicitly, i.e. when using MCJIT. One test case for case (1) is included, other test cases with relocation mode sensitive behavior are wired to static for now. They will be reviewed and adjusted separately. Differential Revision: https://reviews.llvm.org/D26566 llvm-svn: 289743
* [AMDGPU] Fix runtime-metadata.ll test so it doesn't leave an object file in ↵Justin Lebar2016-12-141-1/+1
| | | | | | the source tree. llvm-svn: 289742
* [NVPTX] Remove dead code.Justin Lebar2016-12-145-130/+0
| | | | | | | | | | | I've chosen to remove NVPTXInstrInfo::CanTailMerge but not NVPTXInstrInfo::isLoadInstr and isStoreInstr (which are also dead) because while the latter two are reasonably useful utilities, the former cannot be used safely: It relies on successful address space inference to identify writes to shared memory, but addrspace inference is a best-effort thing. llvm-svn: 289740
* [DAG] allow more select folding for targets that have 'and not' (PR31175)Sanjay Patel2016-12-144-20/+44
| | | | | | | | | | | | | | The original motivation for this patch comes from wanting to canonicalize more IR to selects and also canonicalizing min/max. If we're going to do that, we need more backend fixups to undo select codegen when simpler ops will do. I chose AArch64 for the tests because that shows the difference in the simplest way. This should fix: https://llvm.org/bugs/show_bug.cgi?id=31175 Differential Revision: https://reviews.llvm.org/D27489 llvm-svn: 289738
* [gold] Add datalayout to two tests where it was missing.Davide Italiano2016-12-142-0/+6
| | | | | | Reported by: thakis via chromium bots. llvm-svn: 289737
* [Hexagon] Fix some Clang-tidy modernize and Include What You Use warnings; ↵Eugene Zelenko2016-12-145-284/+249
| | | | | | other minor fixes (NFC). llvm-svn: 289736
* Add the ability to get attribute values as Optional<T>Greg Clayton2016-12-146-92/+171
| | | | | | | | | | | | | | | | | | | | | | When getting attributes it is sometimes nicer to use Optional<T> some of the time instead of magic values. I tried to cut over to only using the Optional values but it made many of the call sites very messy, so it makes sense the leave in the calls that can return a default value. Otherwise code that looks like this: uint64_t CallColumn = Die.getAttributeValueAsAddress(DW_AT_call_line, 0); Has to be turned into: uint64_t CallColumn = 0; if (auto CallColumnValue = Die.getAttributeValueAsAddress(DW_AT_call_line)) CallColumn = *CallColumnValue; The first snippet of code looks much better. But in cases where you want an offset that may or may not be there, the following code looks better: if (auto StmtOffset = Die.getAttributeValueAsSectionOffset(DW_AT_stmt_list)) { // Use StmtOffset } Differential Revision: https://reviews.llvm.org/D27772 llvm-svn: 289731
* Whitespace cleanup in test/CodeGen/NVPTX/annotations.ll.Justin Lebar2016-12-141-4/+0
| | | | llvm-svn: 289730
* [NVPTX] Support .maxnreg annotation.Justin Lebar2016-12-144-3/+22
| | | | | | | | | | Reviewers: tra Subscribers: llvm-commits, jholewinski Differential Revision: https://reviews.llvm.org/D27638 llvm-svn: 289729
* [NVPTX] Remove string constants from NVPTXBaseInfo.h.Justin Lebar2016-12-143-165/+88
| | | | | | | | | | | | | | | | | | Summary: Previously they were defined as a 2D char array in a header file. This is kind of overkill -- we can let the linker lay out these strings however it pleases. While we're at it, we might as well just inline these constants where they're used, as each of them is used only once. Also move NVPTXUtilities.{h,cpp} into namespace llvm. Reviewers: tra Subscribers: jholewinski, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D27636 llvm-svn: 289728
* LibDriver: Reject inputs that are not COFF objects or bitcode files.Peter Collingbourne2016-12-144-1/+10
| | | | | | | | Fixes PR31372. Differential Revision: https://reviews.llvm.org/D27776 llvm-svn: 289726
* Only sets profile summary when it was not preset.Dehao Chen2016-12-142-1/+3
| | | | | | | | | | | | Summary: SampleProfileLoader pass may be invoked twice by LTO. The 2nd pass should not append more summary info as it is already preset by the 1st pass. Reviewers: eraman, davidxl Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D27733 llvm-svn: 289725
* Fix the bug in r289714 (NFC).Dehao Chen2016-12-141-1/+1
| | | | llvm-svn: 289724
* Revert revision 289721.Jan Sjodin2016-12-141-1/+0
| | | | llvm-svn: 289723
* Dummy commit.Jan Sjodin2016-12-141-0/+1
| | | | llvm-svn: 289721
* [LTO] Add the missing datalayout in a test.Davide Italiano2016-12-141-0/+1
| | | | llvm-svn: 289720
* [LTO] Reject modules without datalayout.Davide Italiano2016-12-1490-10/+162
| | | | | | | | | | | Also, udpate the ~60 failing tests in the tree which did not contain a valid datalayout. This fixes PR31123. lld will be updated in a following patch, immediately after this is committed. Differential Revision: https://reviews.llvm.org/D27082 llvm-svn: 289719
OpenPOWER on IntegriCloud