summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* Store defined macros in MCContext.Rafael Espindola2018-02-141-51/+5
| | | | | | | | | | | | | | | So that macros defined in inline assembly blocks are available to the whole file. This provides a consistent behavior with other assembly directives, since equations for example are already preserved between inline assembly blocks. PR: 36110 Patch by Roger! llvm-svn: 325139
* [SelectionDAG][X86] Fix incorrect offset generated for VMASKMOVAlexander Ivchenko2018-02-142-18/+21
| | | | | | | | When creating high MachineMemOperand for MSTORE/MLOAD we supply it with the original PointerInfo, while the pointer itself had been incremented. The patch adds the proper offset to the PointerInfo. llvm-svn: 325135
* [SLP] Allow vectorization of reversed loads.Alexey Bataev2018-02-141-6/+20
| | | | | | | | | | | | | | Summary: Reversed loads are handled as gathering. But we can just reshuffle these values. Patch adds support for vectorization of reversed loads. Reviewers: RKSimon, spatel, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43022 llvm-svn: 325134
* [ARM] f16 stack spill/reloadsSjoerd Meijer2018-02-141-1/+21
| | | | | | | | This adds support for handling f16 stack spills/reloads. Differential Revision: https://reviews.llvm.org/D43280 llvm-svn: 325130
* Fix GCC -Wlogical-op-parentheses warning. NFCI.Simon Pilgrim2018-02-141-2/+2
| | | | llvm-svn: 325129
* [X86] Reduce Store Forward Block issues in HW - Recommit after fixing Bug 36346Lama Saba2018-02-144-0/+585
| | | | | | | | | | | | | | | If a load follows a store and reloads data that the store has written to memory, Intel microarchitectures can in many cases forward the data directly from the store to the load, This "store forwarding" saves cycles by enabling the load to directly obtain the data instead of accessing the data from cache or memory. A "store forward block" occurs in cases that a store cannot be forwarded to the load. The most typical case of store forward block on Intel Core microarchiticutre that a small store cannot be forwarded to a large load. The estimated penalty for a store forward block is ~13 cycles. This pass tries to recognize and handle cases where "store forward block" is created by the compiler when lowering memcpy calls to a sequence of a load and a store. The pass currently only handles cases where memcpy is lowered to XMM/YMM registers, it tries to break the memcpy into smaller copies. breaking the memcpy should be possible since there is no atomicity guarantee for loads and stores to XMM/YMM. Change-Id: Ic41aa9ade6512e0478db66e07e2fde41b4fb35f9 llvm-svn: 325128
* [X86][SSE] Relax type legality for combineTruncateWithSat PACKSS/PACKUS ↵Simon Pilgrim2018-02-141-4/+4
| | | | | | | | truncation While the AVX512 VTRUNCS/VTRUNCUS instructions require legal types, truncateVectorWithPACK handles cases with multiples of legal types through splitting/concatenation. So we just need to ensure that the src/dst scalar types are correct and leave truncateVectorWithPACK to handle the rest of it. llvm-svn: 325127
* Recommit r325001: [CallSiteSplitting] Support splitting of blocks with ↵Florian Hahn2018-02-141-22/+77
| | | | | | | | | | | | | | | | | | | | instrs before call. For basic blocks with instructions between the beginning of the block and a call we have to duplicate the instructions before the call in all split blocks and add PHI nodes for uses of the duplicated instructions after the call. Currently, the threshold for the number of instructions before a call is quite low, to keep the impact on binary size low. Reviewers: junbuml, mcrosier, davidxl, davide Reviewed By: junbuml Differential Revision: https://reviews.llvm.org/D41860 llvm-svn: 325126
* [LoopInterchange] Incrementally update the dominator tree.Florian Hahn2018-02-141-34/+40
| | | | | | | | | | | | | We can use incremental dominator tree updates to avoid re-calculating the dominator tree after interchanging 2 loops. Reviewers: dmgreen, kuhar Reviewed By: kuhar Differential Revision: https://reviews.llvm.org/D43176 llvm-svn: 325122
* [Utils] Salvage the debug info of DCE'ed 'and' instructionsPetar Jovanovic2018-02-143-0/+5
| | | | | | | | | | Preserve debug info from a dead 'and' instruction with a constant. Patch by Djordje Todorovic. Differential Revision: https://reviews.llvm.org/D43163 llvm-svn: 325119
* Revert r325107 (case folding DJB hash) and subsequent build fixPavel Labath2018-02-143-820/+1
| | | | | | | The "knownValuesUnicode" test in the patch fails on ppc64 and arm64 bots. Reverting while I investigate. llvm-svn: 325115
* [IRMover] Move type name extraction to a separate function. NFCEugene Leviant2018-02-141-6/+11
| | | | llvm-svn: 325110
* Implement a case-folding version of DJB hashPavel Labath2018-02-143-1/+820
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch implements a variant of the DJB hash function which folds the input according to the algorithm in the Dwarf 5 specification (Section 6.1.1.4.5), which in turn references the Unicode Standard (Section 5.18, "Case Mappings"). To achieve this, I have added a llvm::sys::unicode::foldCharSimple function, which performs this mapping. The implementation of this function was generated from the CaseMatching.txt file from the Unicode spec using a python script (which is also included in this patch). The script tries to optimize the function by coalescing adjecant mappings with the same shift and stride (terms I made up). Theoretically, it could be made a bit smarter and merge adjecant blocks that were interrupted by only one or two characters with exceptional mapping, but this would save only a couple of branches, while it would greatly complicate the implementation, so I deemed it was not worth it. Since we assume that the vast majority of the input characters will be US-ASCII, the folding hash function has a fast-path for handling these, and only whips out the full decode+fold+encode logic if we encounter a character outside of this range. It might be possible to implement the folding directly on utf8 sequences, but this would also bring a lot of complexity for the few cases where we will actually need to process non-ascii characters. Reviewers: JDevlieghere, aprantl, probinson, dblaikie Subscribers: mgorny, hintonda, echristo, clayborg, vleschuk, llvm-commits Differential Revision: https://reviews.llvm.org/D42740 llvm-svn: 325107
* Adding a width of the GEP index to the Data Layout.Elena Demikhovsky2018-02-1420-103/+162
| | | | | | | | | | | | | | | | | | Making a width of GEP Index, which is used for address calculation, to be one of the pointer properties in the Data Layout. p[address space]:size:memory_size:alignment:pref_alignment:index_size_in_bits. The index size parameter is optional, if not specified, it is equal to the pointer size. Till now, the InstCombiner normalized GEPs and extended the Index operand to the pointer width. It works fine if you can convert pointer to integer for address calculation and all registered targets do this. But some ISAs have very restricted instruction set for the pointer calculation. During discussions were desided to retrieve information for GEP index from the Data Layout. http://lists.llvm.org/pipermail/llvm-dev/2018-January/120416.html I added an interface to the Data Layout and I changed the InstCombiner and some other passes to take the Index width into account. This change does not affect any in-tree target. I added tests to cover data layouts with explicitly specified index size. Differential Revision: https://reviews.llvm.org/D42123 llvm-svn: 325102
* [SelectionDAG] Remove duplicate code from TargetLowering::SimplifySetCC.Craig Topper2018-02-141-4/+0
| | | | | | This exact code already exists a little further up. llvm-svn: 325101
* [X86] Remove dead code from retpoline thunk generationReid Kleckner2018-02-141-26/+0
| | | | | | Follow-up to r325049 llvm-svn: 325085
* Fix off-by-one in set_thread_name which causes truncation to fail on LinuxSam McCall2018-02-131-1/+2
| | | | llvm-svn: 325069
* [globalisel][legalizerinfo] Follow up on post-commit review comments after ↵Daniel Sanders2018-02-132-1/+4
| | | | | | | | | | | | | | r323681 * Document most API's * Delete a useless function call * Fix a discrepancy between the single and multi-opcode variants of getActionDefinitions(). The multi-opcode variant now requires that more than one opcode is requested. Previously it acted much like the single-opcode form but unnecessarily enforced the requirements of the multi-opcode form. llvm-svn: 325067
* [GVN] Salvage debug info from dead instsVedant Kumar2018-02-131-0/+2
| | | | | | | | | | This preserves an additional 581 unique source variables in a stage2 build of clang (according to `llvm-dwarfdump --statistics`). It increases the size of the .debug_loc section by 0.1% (or 87139 bytes). Differential Revision: https://reviews.llvm.org/D43255 llvm-svn: 325063
* [InstCombine] (lshr X, 31) * Y --> (ashr X, 31) & YSanjay Patel2018-02-131-25/+13
| | | | | | | | | | | This replaces the bit-tracking based fold that did the same thing, but it only worked for scalars and not directly. There is no evidence in existing regression tests that the greater power of bit-tracking was needed here, but we should be aware of this potential loss of optimization. llvm-svn: 325062
* [X86] Use EDI for retpoline when no scratch regs are leftReid Kleckner2018-02-132-63/+29
| | | | | | | | | | | | | | | | | | | | Summary: Instead of solving the hard problem of how to pass the callee to the indirect jump thunk without a register, just use a CSR. At a call boundary, there's nothing stopping us from using a CSR to hold the callee as long as we save and restore it in the prologue. Also, add tests for this mregparm=3 case. I wrote execution tests for __llvm_retpoline_push, but they never got committed as lit tests, either because I never rewrote them or because they got lost in merge conflicts. Reviewers: chandlerc, dwmw2 Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D43214 llvm-svn: 325049
* [InstCombine] (bool X) * Y --> X ? Y : 0Sanjay Patel2018-02-131-0/+9
| | | | | | | | | This is both a functional improvement for vectors and an efficiency improvement for scalars. The existing code below the new folds does the same thing for scalars, but in an indirect and expensive way. llvm-svn: 325048
* Document the shortcomings of DwarfExpression::addMachineReg().Adrian Prantl2018-02-131-4/+12
| | | | | | | | | | | | | Also make a drive-by-fix of a bug in the subregister scan code that only triggers with an incomplete or otherwise very irregular machine description. rdar://problem/37404493 This re-applies r324972 with an early exit in the case of a complete failure to make this commit NFC again as intended. llvm-svn: 325041
* [DeadStoreElimination] Salvage debug info from dead instsVedant Kumar2018-02-131-0/+3
| | | | | | | | | | According to `llvm-dwarfdump --statistics` this salvages 43 additional unique source variables in a stage2 build of clang. It increases the size of the .debug_loc section by 0.002% (or 2864 bytes). Differential Revision: https://reviews.llvm.org/D43220 llvm-svn: 325035
* Revert r324903 "[AArch64] Refactor identification of SIMD immediates"Hans Wennborg2018-02-131-280/+368
| | | | | | | | | | | It caused "Cannot select: t33: f64 = AArch64ISD::FMOV Constant:i32<0>" in Chromium builds. See PR36369. > Get rid of icky goto loops and make the code easier to maintain (NFC). > > Differential revision: https://reviews.llvm.org/D42723 llvm-svn: 325034
* [CodeGen] Print bundled instructions using the MIR syntax in -debug outputFrancis Visoiu Mistrih2018-02-131-7/+23
| | | | | | | | | | | | | | | | | Old syntax: BUNDLE implicit-def %r0, implicit-def %r1, implicit %r2 * %r0 = SOME_OP %r2 * %r1 = ANOTHER_OP internal %r0 New syntax: BUNDLE implicit-def %r0, implicit-def %r1, implicit %r2 { %r0 = SOME_OP %r2 %r1 = ANOTHER_OP internal %r0 } llvm-svn: 325032
* [AMDGPU] Change constant addr space to 4Yaxun Liu2018-02-136-9/+10
| | | | | | Differential Revision: https://reviews.llvm.org/D43170 llvm-svn: 325030
* [DAGCombiner] Add one use check to fold (not (and x, y)) -> (or (not x), ↵Craig Topper2018-02-131-2/+2
| | | | | | | | | | | | | | | | | | | (not y)) Summary: If the and has an additional use we shouldn't invert it. That creates an additional instruction. While there add a one use check to the transform above that looked similar. Reviewers: spatel, RKSimon Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43225 llvm-svn: 325019
* [X86] Add combine to shrink 64-bit ands when one input is an any_extend and ↵Craig Topper2018-02-131-0/+14
| | | | | | | | | | | | | | | | the other input guarantees upper 32 bits are 0. Summary: This gets the shift case from PR35792. Reviewers: spatel, RKSimon Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43222 llvm-svn: 325018
* [Hexagon] Simplify some code, NFCKrzysztof Parzyszek2018-02-131-114/+44
| | | | llvm-svn: 325014
* [Hexagon] Remove unnecessary checkKrzysztof Parzyszek2018-02-131-2/+0
| | | | llvm-svn: 325013
* [ARM] Allow half types in ConstantPoolSjoerd Meijer2018-02-131-1/+9
| | | | | | | | | | | Change ARMConstantIslandPass to: - accept f16 literals as litpool entries, - if the litpool needs to be inserted in the middle of a big block, then we need to 4-byte align the next instruction in ARM mode. Differential Revision: https://reviews.llvm.org/D42784 llvm-svn: 325012
* [DAG] fix type of undef returned by getNode()Sanjay Patel2018-02-131-2/+2
| | | | | | | | The bug has been lying dormant, but apparently was never exposed, until after rL324941 because we didn't return the correct result for shifts with undef operands. llvm-svn: 325010
* Revert r325001: [CallSiteSplitting] Support splitting of blocks with instrs ↵Florian Hahn2018-02-131-81/+22
| | | | | | | | before call. Due to memsan not being happy with the array of ValueToValue maps. llvm-svn: 325009
* [IR] Fix creating mutable versions of TBAA access tagsIvan A. Kosarev2018-02-131-1/+1
| | | | | | | | | | Due to a typo in D41565, mutable TBAA tags created with createMutableTBAAAccessTag() lose their base types. This patch fixes that typo and updates tests respectively. Differential Revision: https://reviews.llvm.org/D42364 llvm-svn: 325008
* [CallSiteSplitting] Clear ValueToValue maps.Florian Hahn2018-02-131-0/+4
| | | | llvm-svn: 325006
* [CallSiteSplitting] Dereference pointer earlier.Florian Hahn2018-02-131-3/+3
| | | | | | This should make the sanitizers happy. llvm-svn: 325004
* [InstCombine] Simplify getLogBase2 case for scalar/splats. NFCI.Simon Pilgrim2018-02-131-3/+2
| | | | llvm-svn: 325003
* [CallSiteSplitting] Support splitting of blocks with instrs before call.Florian Hahn2018-02-131-22/+77
| | | | | | | | | | | | | | | | | | For basic blocks with instructions between the beginning of the block and a call we have to duplicate the instructions before the call in all split blocks and add PHI nodes for uses of the duplicated instructions after the call. Currently, the threshold for the number of instructions before a call is quite low, to keep the impact on binary size low. Reviewers: junbuml, mcrosier, davidxl, davide Reviewed By: junbuml Differential Revision: https://reviews.llvm.org/D41860 llvm-svn: 325001
* [ARM] Don't print "Requires NEON" error message for M-profileAndre Vieira2018-02-131-0/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D43125 llvm-svn: 325000
* [Thumb] Handle addressing mode AddrMode5FP16Sjoerd Meijer2018-02-131-0/+14
| | | | | | | | This addressing mode wasn't checked, so we were running in an assert. Differential Revision: https://reviews.llvm.org/D43179 llvm-svn: 324996
* [LoopInterchange] Check number of latch successors before accessing them.Florian Hahn2018-02-131-1/+1
| | | | | | | | | | | | | | | | In cases where the OuterMostLoopLatchBI only has a single successor, accessing the second successor will fail. This fixes a failure when building the test-suite with loop-interchange enabled. Reviewers: mcrosier, karthikthecool, davide Reviewed by: karthikthecool Differential Revision: https://reviews.llvm.org/D42906 llvm-svn: 324994
* [X86] Teach EVEX->VEX pass to turn VRNDSCALE into VROUND when bits 7:4 of ↵Craig Topper2018-02-131-2/+27
| | | | | | | | | | the immediate are 0 and the regular EVEX->VEX checks pass. Bits 7:4 control the scale part of the operation. If the scale is 0 the behavior is equivalent to VROUND. Fixes PR36246 llvm-svn: 324985
* [Utils] Salvage debug info from all no-op castsVedant Kumar2018-02-131-4/+7
| | | | | | | | | | | We already try to salvage debug values from no-op bitcasts and inttoptr instructions: we should handle ptrtoint instructions as well. This saves an additional 24,444 debug values in a stage2 build of clang, and (according to llvm-dwarfdump --statistics) provides an additional 289 unique source variables. llvm-svn: 324982
* Revert "Rewrite the cached map used for locating the most precise DIE among ↵David Blaikie2018-02-131-360/+26
| | | | | | | | | | | | | | | inlined subroutines for a given address." Seeing some inlining missing in internal uses of symbolizer. I'll work on a reproduction, tests, improvements & recommit as soon as possible. (Chandler would like it to be known that this improvement did make check-llvm 4x faster... - so there's certainly some fairly good motivation to push on fixing/figuring this out & getting it back in) This reverts commit r321345. llvm-svn: 324981
* [X86] Use getTypeAction in most places that were checking ↵Craig Topper2018-02-131-8/+9
| | | | | | | | ExperimentalVectorWideningLegalization. This will allow more flexibility in what types we legalize via widening or not. This should help with a couple lines in D41062. llvm-svn: 324980
* Revert "Document the shortcomings of DwarfExpression::addMachineReg()."Adrian Prantl2018-02-131-7/+3
| | | | | | | This reverts commit r324972. This commit broke a bot, so perhaps it is testable after all? llvm-svn: 324977
* [Utils] Salvage debug info of DCE'ed mul/sdiv/srem instructionsVedant Kumar2018-02-133-0/+13
| | | | | | | | | | | | | Here are the number of additional debug values salvaged in a stage2 build of clang: 63 SALVAGE: MUL 1250 SALVAGE: SDIV (No values were salvaged from `srem` instructions in this experiment, but it's a simple case to handle so we might as well.) llvm-svn: 324976
* [Utils] Salvage debug info of DCE'ed shl/lhsr/ashr instructionsVedant Kumar2018-02-133-0/+15
| | | | | | | | | | | Here are the number of additional debug values salvaged in a stage2 build of clang: 1912 SALVAGE: ASHR 405 SALVAGE: LSHR 249 SALVAGE: SHL llvm-svn: 324975
* [Utils] Salvage the debug info of DCE'ed 'sub' instructionsVedant Kumar2018-02-131-0/+3
| | | | | | This salvages 14 debug values in a stage2 build of clang. llvm-svn: 324974
OpenPOWER on IntegriCloud