path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* [SimplifyCFG] add a struct to house optional folds (PR34603)Sanjay Patel2017-09-274-76/+67
| This was intended to be no-functional-change, but it's not - there's a test diff. So I thought I should stop here and post it as-is to see if this looks like what was expected based on the discussion in PR34603:
| https://bugs.llvm.org/show_bug.cgi?id=34603
|
| Notes:
| 1. The test improvement occurs because the existing 'LateSimplifyCFG' marker is not carried through the recursive calls to 'SimplifyCFG()->SimplifyCFGOpt().run()->SimplifyCFG()'. The parameter isn't passed down, so we pick up the default value from the function signature after the first level. I assumed that was a bug, so I've passed 'Options' down in all of the 'SimplifyCFG' calls.
| 2. I split 'LateSimplifyCFG' into 2 bits: ConvertSwitchToLookupTable and KeepCanonicalLoops. This would theoretically allow us to differentiate the transforms controlled by those params independently.
| 3. We could stash the optional AssumptionCache pointer and 'LoopHeaders' pointer in the struct too. I just stopped here to minimize the diffs.
| 4. Similarly, I stopped short of messing with the pass manager layer. I have another question that could wait for the follow-up: why is the new pass manager creating the pass with LateSimplifyCFG set to true no matter where in the pipeline it's creating SimplifyCFG passes?
|
|   // Create an early function pass manager to cleanup the output of the
|   // frontend.
|   EarlyFPM.addPass(SimplifyCFGPass());
|   -->
|   /// \brief Construct a pass with the default thresholds
|   /// and switch optimizations.
|   SimplifyCFGPass::SimplifyCFGPass()
|       : BonusInstThreshold(UserBonusInstThreshold),
|         LateSimplifyCFG(true) {} <-- switches get converted to lookup tables and loops may not be in canonical form
|
| If this is unintended, then it's possible that the current behavior of dropping the 'LateSimplifyCFG' setting via recursion was masking this bug.
|
| Differential Revision: https://reviews.llvm.org/D38138
| llvm-svn: 314308
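Note 1 above describes a flag that is lost because a recursive call falls back on a default argument. The following is a minimal sketch of that failure mode and of the fix, not the actual SimplifyCFG code: `SimplifyOptions`, `visitDropping` and `visitForwarding` are hypothetical names, and only the two field names are taken from the commit message.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the options struct; the two field names follow
// the commit message, everything else is illustrative.
struct SimplifyOptions {
  bool ConvertSwitchToLookupTable = false;
  bool KeepCanonicalLoops = true;
};

// Buggy shape: the flag has a default value and the recursive call does not
// pass it on, so every level below the first sees the default.
inline void visitDropping(int Depth, std::vector<bool> &Seen,
                          bool Late = false) {
  assert(Depth >= 0);
  Seen.push_back(Late);
  if (Depth > 0)
    visitDropping(Depth - 1, Seen); // 'Late' silently resets to false here
}

// Fixed shape: the options travel as one struct without a default, so a
// recursive call cannot compile unless it forwards them.
inline void visitForwarding(int Depth, std::vector<bool> &Seen,
                            const SimplifyOptions &Opts) {
  assert(Depth >= 0);
  Seen.push_back(Opts.ConvertSwitchToLookupTable);
  if (Depth > 0)
    visitForwarding(Depth - 1, Seen, Opts);
}
```

Calling `visitDropping(2, Seen, true)` records the flag as set only at the outermost level, which mirrors the behavior the commit message calls a bug.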
* [InlineCost] add visitSelectInst()Haicheng Wu2017-09-271-0/+82
| | | | | | | | | | InlineCost can understand Select IR now. This patch finds free Select IRs and continues the propagation of SimplifiedValues, ConstantOffsetPtrs, and SROAArgValues. Differential Revision: https://reviews.llvm.org/D37198 llvm-svn: 314307
* Typo: const MCSchedModel SchedModel -> const MCSchedModel &SchedModelKrzysztof Parzyszek2017-09-271-1/+1
| | | | llvm-svn: 314301
* [RegAllocGreedy] Fix spelling error, "inteference" -> "interference", NFCMikael Holmen2017-09-271-3/+3
| | | | llvm-svn: 314299
* [PowerPC] eliminate unconditional branch to the next instructionHiroshi Inoue2017-09-271-0/+14
| | | | | | | | | This patch makes analyzeBranch eliminate an unconditional branch to the next instruction. After basic blocks are re-organized by optimizers, such as machine block placement, a BB may end with an unconditional branch to the next (fallthrough) BB. This patch removes such redundant branch instructions. Differential Revision: https://reviews.llvm.org/D37730 llvm-svn: 314297
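The idea of dropping a branch to the fallthrough block can be sketched on a toy layout. This is an illustration only, assuming a simplified model in which each block ends in at most one unconditional branch; `ToyBlock` and `removeFallthroughBranches` are hypothetical names and do not correspond to the PowerPC analyzeBranch implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of a machine basic block: it optionally ends in an unconditional
// branch to the block with index BranchTarget (-1 means "no branch").
struct ToyBlock {
  int BranchTarget = -1;
};

// Drop every unconditional branch whose target is the block laid out
// immediately after it, and return how many branches were removed.
inline int removeFallthroughBranches(std::vector<ToyBlock> &Layout) {
  int Removed = 0;
  for (std::size_t I = 0; I + 1 < Layout.size(); ++I) {
    // Targets must refer to a block that exists in the layout.
    assert(Layout[I].BranchTarget < static_cast<int>(Layout.size()));
    if (Layout[I].BranchTarget == static_cast<int>(I + 1)) {
      Layout[I].BranchTarget = -1; // falls through, so the branch is redundant
      ++Removed;
    }
  }
  return Removed;
}
```

A branch to any block other than the next one in the layout is left alone, since removing it would change control flow.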
* [Misched]: Remove double call getMicroOpFactor.NFC.Javed Absar2017-09-271-1/+1
| | | | | | | Reviewed by: @MatzeB Differential Revision: https://reviews.llvm.org/D38176 llvm-svn: 314296
* [X86][AsmParser] fix PR32035Coby Tayree2017-09-271-0/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D37473 llvm-svn: 314295
* [X86][AVX] Improve (i4 bitcast (v4i1 x)) handling for 256-bit vector compare results.Simon Pilgrim2017-09-271-1/+1
| | | | | | As commented on D37849 and rL313547, AVX1 targets were missing a chance to use vmovmskpd for v4f64/v4i64 results for bool vector bitcasts llvm-svn: 314293
* [dwarfdump] Fix printing of .debug_line offset.Jonas Devlieghere2017-09-271-1/+1
| | | | | | | | | Fixes 32-bit buildbots: http://bb.pgr.jp/builders/test-llvm-i686-linux-RA/builds/542 http://lab.llvm.org:8011/builders/clang-cmake-thumbv7-a15/builds/11533 http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/11494 llvm-svn: 314291
* [dwarfdump] Add support for -debug-line=OFFSETJonas Devlieghere2017-09-271-0/+3
| | | | | | | | This patch adds support for passing an offset to -debug-line. Differential revision: https://reviews.llvm.org/D38240 llvm-svn: 314288
* [dwarfdump] Add support for -debug-loc=OFFSETJonas Devlieghere2017-09-272-6/+48
| | | | | | | | This patch adds support for passing an offset to -debug-loc. Differential revision: https://reviews.llvm.org/D38237 llvm-svn: 314286
* [ARM] isTruncateFree fixSam Parker2017-09-271-6/+6
| | | | | | | | | | I implemented isTruncateFree in rL313533, this patch fixes the logic to match my comment, as the previous logic was too general. Now the only truncates that are free are i64 -> i32. Differential Revision: https://reviews.llvm.org/D38234 llvm-svn: 314280
* [XRay] initialize all members of YAMLXRayRecord for -Wmissing-field-initializersMartin Pelikan2017-09-271-1/+1
| | | | llvm-svn: 314278
* [X86] Fix SJLJ struct offsets for x86_64Martin Storsjo2017-09-271-2/+2
| | | | | | | | | This is necessary, but not sufficient, for having working SJLJ exception handling on x86_64. Differential Revision: https://reviews.llvm.org/D38254 llvm-svn: 314277
* [X86] Remove erroneous callsite offsetting in SJLJ landing padsMartin Storsjo2017-09-271-6/+2
| | | | | | | | | | | | | | | | | | The callsite value is already stored indexed from 0 in the _Unwind_Context struct. When accessed via the functions _Unwind_GetIP and _Unwind_SetIP, the value is indexed from 1, but those functions handle the offsetting. When reading directly from the struct here, we shouldn't subtract 1. This matches the code generated by the ARM target, where SJLJ exception handling is used by default on iOS. This makes clang-built object files for 32 bit x86 mingw work when linked with libgcc/libstdc++. Differential Revision: https://reviews.llvm.org/D38251 llvm-svn: 314276
* [X86] Use extract128BitVector in LowerMULH so we can extract from constant build vectors.Craig Topper2017-09-271-5/+6
| | | | | | llvm-svn: 314274
* MemorySSAUpdater: Only add phis to insertedphis if we actually inserted them, not if we just found existing onesDaniel Berlin2017-09-271-3/+2
| | | | | | llvm-svn: 314273
* [SelectionDAG] Make NewSDValueDbgMsg print target specific nodes correctly by passing in the SelectionDAG.Craig Topper2017-09-271-12/+12
| | | | | | llvm-svn: 314271
* [XRay] convert FDR arg1 log entriesMartin Pelikan2017-09-271-2/+27
| | | | | | | | | | | | | | | | | | | | Summary: A new FDR metadata record will support logging a function call argument; appending multiple metadata records will represent a sequence of arguments meaning that "holes" are not representable by the buffer format. Each call argument is currently a 64-bit value (useful for "this" pointers and synchronization objects). If present, we put this argument to the function call "entry" record it belongs to, and alter its type to notify the user of its presence. Reviewers: dberris Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32840 llvm-svn: 314269
* [SimplifyIndVar] Constant fold IV usersHongbin Zheng2017-09-271-0/+30
| This patch tries to transform cases like:
|
|   for (unsigned i = 0; i < N; i += 2) {
|     bool c0 = (i & 0x1) == 0;
|     bool c1 = ((i + 1) & 0x1) == 1;
|   }
|
| To
|
|   for (unsigned i = 0; i < N; i += 2) {
|     bool c0 = true;
|     bool c1 = true;
|   }
|
| This commit also updates test/Transforms/IndVarSimplify/replace-srem-by-urem.ll to prevent constant folding.
|
| Differential Revision: https://reviews.llvm.org/D38272
| llvm-svn: 314266
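The fold is valid because the induction variable starts at 0 and steps by 2, so it is always even. A small brute-force check of that claim, written as a sketch with the hypothetical helper name `conditionsAlwaysTrue` and an arbitrary bound on N to keep the loop cheap and free of wraparound:

```cpp
#include <cassert>

// Returns true if both conditions from the commit message hold for every
// value the induction variable takes in `for (i = 0; i < N; i += 2)`.
inline bool conditionsAlwaysTrue(unsigned N) {
  // Illustrative bound: keeps `i += 2` far away from unsigned wraparound.
  assert(N <= (1u << 20));
  for (unsigned i = 0; i < N; i += 2) {
    bool c0 = (i & 0x1) == 0;       // i is even, so its low bit is clear
    bool c1 = ((i + 1) & 0x1) == 1; // i + 1 is odd, so its low bit is set
    if (!c0 || !c1)
      return false;
  }
  return true;
}
```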
* [BypassSlowDivision] Improve our handling of divisions by constantsSanjoy Das2017-09-261-7/+13
| | | | | | | | | | | | | Summary: Don't bail out on constant divisors for divisions that can be narrowed without introducing control flow. This gives us a 32 bit multiply instead of an emulated 64 bit multiply in the generated PTX assembly. Reviewers: jlebar Subscribers: jholewinski, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D38265 llvm-svn: 314253
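Narrowing is safe when both operands are known to fit in 32 bits, because the 32-bit quotient then equals the 64-bit one. The sketch below only illustrates that equivalence: `narrowDivide` is a hypothetical helper, and the runtime assert stands in for the compile-time knowledge the pass would rely on in the "no control flow" case.

```cpp
#include <cassert>
#include <cstdint>

// Divide two 64-bit values that are known to fit in 32 bits using a 32-bit
// division. The assert models the "known narrow" precondition.
inline uint64_t narrowDivide(uint64_t Dividend, uint64_t Divisor) {
  assert(Divisor != 0 && "division by zero");
  assert(((Dividend | Divisor) >> 32) == 0 && "operands must fit in 32 bits");
  // With the upper halves known to be zero, the truncation loses nothing.
  return static_cast<uint32_t>(Dividend) / static_cast<uint32_t>(Divisor);
}
```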
* [AArch64][Falkor] Fix bug in falkor prefetcher fix pass.Geoff Berry2017-09-261-3/+8
| | | | | | | | | | | | | | | Summary: In rare cases, loads that don't get prefetched that were marked as strided loads could cause a crash if they occurred in a loop with other colliding loads. Reviewers: mcrosier Subscribers: aemerson, rengolin, javed.absar, kristof.beyls Differential Revision: https://reviews.llvm.org/D38261 llvm-svn: 314252
* [AArch64][Falkor] Fix correctness bug in falkor prefetcher fix pass and correct some opcode tag computations.Geoff Berry2017-09-261-49/+52
| | | | | | | | | | | | | | | | | | | | | Summary: This addresses a correctness bug for LD[1234]*_POST opcodes that have the prefetcher fix applied to them: the base register was not being written back from the temp after being incremented, so it would appear to never be incremented. Also, fix some opcode tag computations based on some updated HW details to get better tag avoidance and thus better prefetcher performance. Reviewers: mcrosier Subscribers: aemerson, rengolin, javed.absar, kristof.beyls Differential Revision: https://reviews.llvm.org/D38256 llvm-svn: 314251
* [X86] Fix register class name in a comment. NFCCraig Topper2017-09-261-1/+1
| | | | llvm-svn: 314250
* Recommit r314151 "[X86] Make all the NOREX CodeGenOnly instructions into postRA pseudos like the NOREX version of TEST."Craig Topper2017-09-264-27/+37
| | | | | | | | The late MOV8rr_NOREX that caused the crash has been removed. llvm-svn: 314249
* [X86] Don't emit X86::MOV8rr_NOREX from X86InstrInfo::copyPhysReg.Craig Topper2017-09-261-7/+5
| | | | | | This hook is called after register allocation with two physical registers. We don't need a separate instruction at that time to force register class constraints. I left in the assert though. We also have a fatal error in X86MCCodeEmitter if we ever encode an H-reg and a REX prefix. llvm-svn: 314248
* [X86] Fix typo in comment. NFCCraig Topper2017-09-261-1/+1
| | | | llvm-svn: 314247
* [WebAssembly] Model weakly defined symbols as wasm exportsSam Clegg2017-09-262-14/+7
| | | | | | | | | | | Previously these were being included as both imports and exports, with the import being satisfied by the export (or some strong symbol) at runtime. However, this proved unnecessary and actually complicated linking, as it meant there was not a 1-to-1 mapping between a wasm function/global index and a linker symbol. Differential Revision: https://reviews.llvm.org/D38246 llvm-svn: 314245
* [PowerPC] Reverting sequence of patches for elimination of comparison instructionsNemanja Ivanovic2017-09-261-1061/+0
| | | | | | | | | | | | | | | | | | | | | | In the past while, I've committed a number of patches in the PowerPC back end aimed at eliminating comparison instructions. However, this causes some failures in proprietary source and these issues are not observed in SPEC or any open source packages I've been able to run. As a result, I'm pulling the entire series and will refactor it to: - Have a single entry point for easy control - Have fine-grained control over which patterns we transform A side-effect of this is that test cases for these patches (and modified by them) are XFAIL-ed. This is a temporary measure as it is counter-productive to remove/modify these test cases and then have to modify them again when the refactored patch is recommitted. The failure will be investigated in parallel to the refactoring effort and the recommit will either have a fix for it or will leave this transformation off by default until the problem is resolved. llvm-svn: 314244
* [X86][LLVM]Expanding Supports lowerInterleavedStore() in X86InterleavedAccess (VF{8|16|32} stride 3)Michael Zuckerman2017-09-261-3/+139
| This patch expands the support of lowerInterleavedStore to {8|16|32}x8i stride 3. LLVM creates suboptimal shuffle code-gen for AVX2. Overall, this patch is a specific fix for the pattern (Stride=3 VF={8|16|32}). This patch is part two of two patches and it covers the store (interleaved) side.
|
| The goal of the patch is to optimize the following sequence:
|   a0 a1 a2 a3 a4 a5 a6 a7
|   b0 b1 b2 b3 b4 b5 b6 b7
|   c0 c1 c2 c3 c4 c5 c6 c7
| into
|   a0 b0 c0 a1 b1 c1 a2 b2
|   c2 a3 b3 c3 a4 b4 c4 a5
|   b5 c5 a6 b6 c6 a7 b7 c7
|
| Reviewers: zvi guyblank dorit Ayal
| Differential Revision: https://reviews.llvm.org/D37117
| Change-Id: I56ced8bcbea809a37654060771911ade20246ccc
| llvm-svn: 314234
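The element order that a stride-3 interleaved store produces can be written down as a scalar reference. This sketch only shows the target ordering from the commit message; `interleave3` is a hypothetical helper and says nothing about the shuffle sequence the X86 backend actually emits.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar reference for a stride-3 interleave: given three equally long
// vectors A, B and C, produce a0 b0 c0 a1 b1 c1 ...
template <typename T>
std::vector<T> interleave3(const std::vector<T> &A, const std::vector<T> &B,
                           const std::vector<T> &C) {
  assert(A.size() == B.size() && B.size() == C.size());
  std::vector<T> Out;
  Out.reserve(3 * A.size());
  for (std::size_t I = 0; I < A.size(); ++I) {
    Out.push_back(A[I]);
    Out.push_back(B[I]);
    Out.push_back(C[I]);
  }
  return Out;
}
```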
* [InstCombine] Remove one use restriction on the shift for calls to foldICmpAndShift.Craig Topper2017-09-261-3/+3
| | | | | | | | | | | | If this transformation succeeds, we're going to remove our dependency on the shift by rewriting the and. So it doesn't matter how many uses the shift has. This distributes the one use check to other transforms in foldICmpAndConstConst that do need it. Differential Revision: https://reviews.llvm.org/D38206 llvm-svn: 314233
* [WebAssembly] Use function/global index space in WasmSymbolSam Clegg2017-09-261-11/+9
| | | | | | | | | | | | | | | | | | | It is useful for the symbol to contain the index of the function or global it represents in the function/global index space. For imports we also store the import index so that the linker can find, for example, the signature of the corresponding function, which is defined by the import. In the long run we need to decide whether this API surface should be closer to binary (where imported functions are separate) or the wasm spec (where the function index space is unified). Differential Revision: https://reviews.llvm.org/D38189 llvm-svn: 314230
* [NVPTX] added match.{any,all}.sync instructions, intrinsics & builtins.Artem Belevich2017-09-264-0/+93
| | | | | | Differential Revision: https://reviews.llvm.org/D38191 llvm-svn: 314223
* [X86] Add support for v16i32 UMUL_LOHI/SMUL_LOHICraig Topper2017-09-261-17/+20
| | | | | | | | | | | | | | Summary: This patch extends the v8i32/v4i32 custom lowering to support v16i32 Reviewers: zvi, RKSimon Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38274 llvm-svn: 314221
* [Hexagon] Fix a typo: #ifndef DEBUG -> #ifndef NDEBUGKrzysztof Parzyszek2017-09-261-1/+1
| | | | llvm-svn: 314216
* [Hexagon] Fix initialization of HexagonSubtargetKrzysztof Parzyszek2017-09-262-38/+18
| | | | | | | Make sure that "initializeSubtargetDependencies" sets all members that InstrInfo and the like may depend on. llvm-svn: 314214
* [dwarfdump] Skip 'stripped' sectionsJonas Devlieghere2017-09-263-0/+12
| | | | | | | | | | | | | | | | | | | When dsymutil generates the companion file, it strips all unnecessary sections by omitting their body and setting the offset in their corresponding load command to zero. One such section is the .eh_frame section, as it contains runtime information rather than debug information and is part of the __TEXT segment. When reading this section, we would just read the number of bytes specified in the load command, starting from offset 0 (i.e. the beginning of the file). Rather than trying to parse this obviously invalid section, dwarfdump now skips this. Differential revision: https://reviews.llvm.org/D38135 llvm-svn: 314208
* [X86][XOP] Merge rotation opcodes with AVX512 equivalents. NFCI.Simon Pilgrim2017-09-265-26/+19
| | | | | | | | The XOP rotations act as ROTL with +ve values and ROTR with -ve values, which means that we can treat them all as ROTL with unsigned modulo. We already check that we're only trying to lower as ROTL for XOP rotations. Differential Revision: https://reviews.llvm.org/D37949 llvm-svn: 314207
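The equivalence claimed above (left for positive amounts, right for negative, which is the same as a single left rotate by the amount taken modulo the element width) can be checked on 32-bit scalars. This is a sketch under that scalar assumption; the helper names are hypothetical and the code does not model the XOP instructions themselves.

```cpp
#include <cassert>
#include <cstdint>

// Rotate left by Amt modulo 32.
inline uint32_t rotl32(uint32_t X, unsigned Amt) {
  Amt &= 31;
  return Amt == 0 ? X : (X << Amt) | (X >> (32 - Amt));
}

// Rotate right by Amt modulo 32.
inline uint32_t rotr32(uint32_t X, unsigned Amt) {
  Amt &= 31;
  return Amt == 0 ? X : (X >> Amt) | (X << (32 - Amt));
}

// Signed semantics as described: positive rotates left, negative rotates right.
inline uint32_t rotSigned(uint32_t X, int Amt) {
  assert(Amt > -32 && Amt < 32);
  return Amt >= 0 ? rotl32(X, static_cast<unsigned>(Amt))
                  : rotr32(X, static_cast<unsigned>(-Amt));
}

// Same result using only a left rotate: converting the signed amount to
// unsigned and reducing it modulo 32 maps -k to 32 - k.
inline uint32_t rotAsRotl(uint32_t X, int Amt) {
  assert(Amt > -32 && Amt < 32);
  return rotl32(X, static_cast<unsigned>(Amt));
}
```

For example, an amount of -3 becomes 29 after the unsigned reduction, and rotating left by 29 is the same as rotating right by 3.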
* [DSE] Merge stores when the later store only writes to memory locations the early store also wrote to (2nd try)Sanjay Patel2017-09-261-3/+99
| | | | | | | | | | | | | | | | | | | | | | | | | | | This is a 2nd attempt at: https://reviews.llvm.org/rL310055 ...which was reverted at rL310123 because of PR34074: https://bugs.llvm.org/show_bug.cgi?id=34074 In this version, we break out of the inner loop after we successfully merge and kill a pair of stores. In the earlier rev, we were continuing instead, which meant we could process the invalid info from a now dead store. Original commit message (authored by Filipe Cabecinhas): This fixes PR31777. If both stores' values are ConstantInt, we merge the two stores (shifting the smaller store appropriately) and replace the earlier (and larger) store with an updated constant. In the future we should also support vectors of integers. And maybe float/double if we can. Differential Revision: https://reviews.llvm.org/D30703 llvm-svn: 314206
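Merging two constant stores amounts to splicing the smaller constant into the larger one at the right bit position. A minimal sketch of that arithmetic, assuming a little-endian layout and stores of at most 8 bytes; `mergeStoredConstants` is a hypothetical helper, not the DSE implementation.

```cpp
#include <cassert>
#include <cstdint>

// Merge a later, narrower constant store into an earlier, wider one.
// Assumes little-endian layout; ByteOffset is where the later store begins
// inside the earlier store.
inline uint64_t mergeStoredConstants(uint64_t EarlierVal, unsigned EarlierBytes,
                                     uint64_t LaterVal, unsigned LaterBytes,
                                     unsigned ByteOffset) {
  assert(EarlierBytes <= 8 && LaterBytes >= 1);
  assert(ByteOffset + LaterBytes <= EarlierBytes &&
         "later store must lie entirely inside the earlier store");
  unsigned Shift = 8 * ByteOffset;
  // Mask covering the bytes written by the later store, before shifting.
  uint64_t LaterMask =
      LaterBytes == 8 ? ~0ULL : ((1ULL << (8 * LaterBytes)) - 1);
  // Clear the overwritten bytes, then splice in the shifted later value.
  return (EarlierVal & ~(LaterMask << Shift)) |
         ((LaterVal & LaterMask) << Shift);
}
```

With an earlier 4-byte store of 0x11223344 and a later 2-byte store of 0xAABB at byte offset 1, the merged constant is 0x11AABB44.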
* [x86] fix pr29061Coby Tayree2017-09-261-6/+8
| | | | | | | | https://bugs.llvm.org//show_bug.cgi?id=29061 Don't try referencing REX-needed regs when not in 64-bit mode. Aligns with GCC. Differential Revision: https://reviews.llvm.org/D37801 llvm-svn: 314203
* Don't move llvm.localescape outside the entry block in the GCOV profiling passSylvestre Ledru2017-09-261-1/+11
| | | | | | | | | | | | | | | | | Summary: This fixes https://bugs.llvm.org/show_bug.cgi?id=34714. Patch by Marco Castelluccio Reviewers: rnk Reviewed By: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38224 llvm-svn: 314201
* Revert "[X86] Make all the NOREX CodeGenOnly instructions into postRA pseudos like the NOREX version of TEST."Benjamin Kramer2017-09-264-37/+27
| | | | | | | | Makes llc crash. This reverts commit r314151. llvm-svn: 314199
* [X86] Finishing broadcastf32x2 and broadcasti32x2 intrinsics lowering to IR. llvm side.Uriel Korach2017-09-262-18/+2
| | | | | | | | | | | | Removing X86 broadcast(f/i)32x2 intrinsics from llvm. Adding autoUpgrade support. Moving matching tests from avx512dq-intrinsics.ll to avx512dq-intrinsics-upgrade.ll and from avx512dqvl-intrinsics.ll to avx512dqvl-intrinsics-upgrade.ll. Differential Revision: https://reviews.llvm.org/D38220 llvm-svn: 314195
* TargetLibraryInfo: Stop guessing wchar_t sizeMatthias Braun2017-09-262-10/+4
| | | | | | | | | | | | | | Usually the frontend communicates the size of wchar_t via metadata and we can optimize wcslen (and possibly other calls in the future). In cases without the wchar_size metadata we would previously try to guess the correct size based on the target triple; however this is fragile to keep up to date and may miss users manually changing the size via flags. Better be safe and stop guessing and optimizing if the frontend didn't communicate the size. Differential Revision: https://reviews.llvm.org/D38106 llvm-svn: 314185
* [AVR] Prefer BasicBlock::getIterator over Function::begin()Dylan McKay2017-09-261-1/+1
| | | | | | Thanks to Eli Friedman for the suggestion. llvm-svn: 314182
* [AVR] When lowering shifts into loops, put newly generated MBBs in the same spot as the original MBBDylan McKay2017-09-261-2/+4
| | | | | | | | | | | Discovered in avr-rust/rust#62 https://github.com/avr-rust/rust/issues/62 Patch by Gergo Erdi. llvm-svn: 314180
* [AVR] Use 1-byte alignment for all data typesDylan McKay2017-09-261-1/+1
| | | | | | | | | | | | | | This was an oversight in the original backend data layout. The AVR architecture does not have the concept of unaligned loads - all loads/stores from all addresses are aligned to one byte. Discovered in avr-rust issue #64 https://github.com/avr-rust/rust/issues/64 Patch By Gergo Erdi. llvm-svn: 314179
* Add section headers to SpecialCaseListsVlad Tsyrklevich2017-09-252-80/+145
| Summary:
| Sanitizer blacklist entries currently apply to all sanitizers--there is no way to specify that an entry should only apply to a specific sanitizer. This is important for Control Flow Integrity since there are several different CFI modes that can be enabled at once. For maximum security, CFI blacklist entries should be scoped to only the specific CFI mode(s) that entry applies to.
|
| Adding section headers to SpecialCaseLists allows users to specify more information about list entries, like sanitizer names or other metadata, like so:
|
|   [section1]
|   fun:*fun1*
|   [section2|section3]
|   fun:*fun23*
|
| The section headers are regular expressions. For backwards compatibility, blacklist entries entered before a section header are put into the '[*]' section so that blacklists without sections retain the same behavior.
|
| SpecialCaseList has been modified to also accept a section name when matching against the blacklist. It has also been modified so the follow-up change to clang can define a derived class that allows matching sections by SectionMask instead of by string.
|
| Reviewers: pcc, kcc, eugenis, vsk
| Reviewed By: eugenis, vsk
| Subscribers: vitalybuka, llvm-commits
| Differential Revision: https://reviews.llvm.org/D37924
| llvm-svn: 314170
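The grouping rule described above (entries before any header fall into the '*' section, later entries belong to the most recent header) can be sketched with a small line-based parser. This is an illustration under simplifying assumptions, not the SpecialCaseList implementation: `parseSections` is a hypothetical name, section names are stored as plain strings rather than compiled as regular expressions, and skipping '#' comment lines is assumed behavior.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Group list entries by the section header that precedes them. Entries that
// appear before any header go into the "*" section.
inline std::map<std::string, std::vector<std::string>>
parseSections(const std::string &Text) {
  std::map<std::string, std::vector<std::string>> Sections;
  std::string Current = "*";
  std::istringstream In(Text);
  std::string Line;
  while (std::getline(In, Line)) {
    if (Line.empty() || Line[0] == '#')
      continue; // blank line or comment (assumed syntax)
    if (Line.front() == '[' && Line.back() == ']') {
      assert(Line.size() >= 2);
      // Keep the raw header text; a real matcher would treat it as a regex.
      Current = Line.substr(1, Line.size() - 2);
      continue;
    }
    Sections[Current].push_back(Line);
  }
  return Sections;
}
```

Keeping the header text unparsed is a deliberate shortcut here; matching a sanitizer name against "section2|section3" would need a regular-expression step that this sketch leaves out.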
* Revert r312724 ("[ARM] Remove redundant vcvt patterns.").Eli Friedman2017-09-251-0/+14
| | | | | | | | | | | | | | It leads to some improvements, but also a regression for the simple case, so it's not clearly a good idea. test/CodeGen/ARM/vcvt.ll now has test coverage to show the difference. Ultimately, the right solution is probably to custom-lower fp-to-int conversions, to something like ARMISD::VCVT_F32_S32 plus a bitcast. It's hard to do the right thing when the implicit bitcast isn't visible to DAG transforms. llvm-svn: 314169
* X86: remove R12 from CSR on Windows x64 SwiftCCSaleem Abdulrasool2017-09-252-20/+21
| | | | | | R12 is used for the SwiftError parameter. It is no longer a CSR as it is used to transfer the SwiftError, and the caller must preserve it if they need to. llvm-svn: 314165