summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [X86] Change 32 and 64 bit versions of LSL instruction have a 16-bit memory ↵Craig Topper2018-02-151-4/+6
| | | | | | | | operand. This matches the Intel and AMD documentation and is consistent with the LAR instruction. llvm-svn: 325197
* [X86] Dont' allow 'outs' and 'ins' in at&t syntax without suffixes.Craig Topper2018-02-141-6/+6
| | | | | | The match would be ambiguous, but at&t asm parsing doesn't support ambiguous matches and will just return the first. llvm-svn: 325192
* [X86] Reverse the operand order of invlpga in at&t syntax to match gas.Craig Topper2018-02-141-2/+2
| | | | llvm-svn: 325190
* [InstCombine] clean up fold for X / C -> X * (1.0/C); NFCISanjay Patel2018-02-141-34/+27
| | | | | | This should work with vector constants too, but it's currently limited to scalar. llvm-svn: 325187
* [ThinLTO/CFI] Include TYPE_ID summaries into GLOBALVAL_SUMMARY_BLOCKVitaly Buka2018-02-142-0/+108
| | | | | | | | | | | | | | Summary: TypeID summaries are used by CFI and need to be serialized by ThinLTO indexing for later use by LTO Backend. Reviewers: tejohnson, pcc Subscribers: mehdi_amini, inglorion, eraman, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D42611 llvm-svn: 325182
* [ORC] Consolidate RTDyldObjectLinkingLayer GetMemMgr and GetResolver into aLang Hames2018-02-142-7/+8
| | | | | | | | | unified GetResources callback. Having a single 'GetResources' callback will simplify adding new resources in the future. llvm-svn: 325180
* [ORC] Switch to shared_ptr ownership for AsynchronousSymbolQueries.Lang Hames2018-02-145-35/+39
| | | | | | | Queries need to stay alive until each owner has set the values they are responsible for. llvm-svn: 325179
* [X86] Don't swap argument on BOUND instruction in at&t syntax.Craig Topper2018-02-141-2/+3
| | | | | | | | | | | | The bound instruction does not have reversed operands in gas. Fixes PR27653. Patch by Maya Madhavan. Differential Revision: https://reviews.llvm.org/D43243 llvm-svn: 325178
* [Hexagon] Split HVX vector pair loads/stores, expand unaligned loadsKrzysztof Parzyszek2018-02-145-85/+222
| | | | llvm-svn: 325169
* [CodeGen] Print predecessors, successors, then liveins in -debug printingFrancis Visoiu Mistrih2018-02-141-17/+18
| | | | | | | | Reorder them to match MIR. Predecessors are only comments, and they're not usually printed in MIR. llvm-svn: 325166
* GlobalISel: Add templated functions and pattern matcher support for some ↵Volkan Keles2018-02-141-11/+7
| | | | | | | | | | | | | | | | | | more opcodes Summary: This patch adds templated functions to MachineIRBuilder for some opcodes and adds pattern matcher support for G_AND and G_OR. Reviewers: aditya_nandakumar Reviewed By: aditya_nandakumar Subscribers: rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D43309 llvm-svn: 325162
* Pass a module reference to CloneModule.Rafael Espindola2018-02-143-19/+20
| | | | | | | It can never be null and most callers were already using references or std::unique_ptr. llvm-svn: 325160
* Pass a reference to a module to the bitcode writer.Rafael Espindola2018-02-149-38/+36
| | | | | | | This simplifies most callers as they are already using references or std::unique_ptr. llvm-svn: 325155
* [RegisterClassInfo] Invalidate the register pressure set limit cache when ↵Craig Topper2018-02-141-4/+5
| | | | | | | | | | | | | | reserved regs or callee saved regs change Previously we only invalidated the pressure set limit cached when the TargetRegisterInfo pointer changes. But as reserved regs and callee saved regs are used as part of calculating the limits we should invalidate when those change too. I encountered this when reverting a patch from the 6.0 branch. One of the x86 test files had a function that used rbp as a frame pointer, making it reserved. It was followed by another function which didn't use rbp but had the same TRI so the pressure set limit cache was not invalidated. If i removed the function that used rbp as a frame pointer from the file, the remaining function then got a different register pressure limit for the GR16 pressure set. This caused the machine scheduler to change the scheduling for the function. This was an unexpected change from just deleting a function. I don't have a test case for trunk because the particular x86 test case is different enough from the 6.0 branch to not be affected now. Differential Revision: https://reviews.llvm.org/D43274 llvm-svn: 325153
* Move llvm::computeLoopSafetyInfo from LICM.cpp to LoopUtils.cpp. NFCDavid Green2018-02-142-37/+37
| | | | | | | | | Move computeLoopSafetyInfo, defined in Transforms/Utils/LoopUtils.h, into the corresponding LoopUtils.cpp, as opposed to LICM where it resides at the moment. This will allow other functions from Transforms/Utils to reference it. llvm-svn: 325151
* [X86][SSE] truncateVectorWithPACK - Use src type instead of dst to select ↵Simon Pilgrim2018-02-141-1/+1
| | | | | | | | | | between PACK*SDW/PACK*SWB Try to keep PACK*SDW/PACK*SWB as wide as possible, this helps ComputeNumSignBits as it can only peek through bitcasts to wider types, pre-AVX2 codegen was already doing this as it could peek through bitcasts/subvectors more easily than AVX2 could through shuffles. This shouldn't affect existing results as calls to truncateVectorWithPACK ensure we have enough sign bits to pack to the same value, but it should make it possible to use truncateVectorWithPACK chains to perform saturation in combineTruncateWithSat with a future patch. llvm-svn: 325149
* [InstCombine] Don't fold select(C, Z, binop(select(C, X, Y), W)) -> ↵Craig Topper2018-02-141-2/+17
| | | | | | | | | | | | select(C, Z, binop(Y, W)) if the binop is rem or div. The select may have been preventing a division by zero or INT_MIN/-1 so removing it might not be safe. Fixes PR36362. Differential Revision: https://reviews.llvm.org/D43276 llvm-svn: 325148
* [AMDGPU] Remove non-temporal flag from argument loadsStanislav Mekhanoshin2018-02-141-1/+0
| | | | | | | | | Kernel arguments likely read by all workitems and should not bypass cache. Fixes performance hit in sub-dword argument loads. Differential Revision: https://reviews.llvm.org/D43249 llvm-svn: 325146
* [DWARF] Fix incorrect prologue end line record.Paul Robinson2018-02-144-10/+12
| | | | | | | | | | | | The prologue-end line record must be emitted after the last instruction that is part of the function frame setup code and before the instruction that marks the beginning of the function body. Patch by Carlos Alberto Enciso! Differential Revision: https://reviews.llvm.org/D41762 llvm-svn: 325143
* [InstCombine] simplify isFMulOrFDivWithConstant(); NFCISanjay Patel2018-02-141-15/+7
| | | | llvm-svn: 325142
* [InstCombine] replace isa/cast with dyn_cast; NFCSanjay Patel2018-02-141-3/+2
| | | | llvm-svn: 325141
* [InstCombine] refactor folds for mul with negated operands; NFCISanjay Patel2018-02-141-10/+14
| | | | | | | This keeps with our current usage of 'match' and is easier to see that the optional NSW only applies in the non-constant operand case. llvm-svn: 325140
* Store defined macros in MCContext.Rafael Espindola2018-02-141-51/+5
| | | | | | | | | | | | | | | So that macros defined in inline assembly blocks are available to the whole file. This provides a consistent behavior with other assembly directives, since equations for example are already preserved between inline assembly blocks. PR: 36110 Patch by Roger! llvm-svn: 325139
* [SelectionDAG][X86] Fix incorrect offset generated for VMASKMOVAlexander Ivchenko2018-02-142-18/+21
| | | | | | | | When creating high MachineMemOperand for MSTORE/MLOAD we supply it with the original PointerInfo, while the pointer itself had been incremented. The patch adds the proper offset to the PointerInfo. llvm-svn: 325135
* [SLP] Allow vectorization of reversed loads.Alexey Bataev2018-02-141-6/+20
| | | | | | | | | | | | | | Summary: Reversed loads are handled as gathering. But we can just reshuffle these values. Patch adds support for vectorization of reversed loads. Reviewers: RKSimon, spatel, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43022 llvm-svn: 325134
* [ARM] f16 stack spill/reloadsSjoerd Meijer2018-02-141-1/+21
| | | | | | | | This adds support for handling f16 stack spills/reloads. Differential Revision: https://reviews.llvm.org/D43280 llvm-svn: 325130
* Fix GCC -Wlogical-op-parentheses warning. NFCI.Simon Pilgrim2018-02-141-2/+2
| | | | llvm-svn: 325129
* [X86] Reduce Store Forward Block issues in HW - Recommit after fixing Bug 36346Lama Saba2018-02-144-0/+585
| | | | | | | | | | | | | | | If a load follows a store and reloads data that the store has written to memory, Intel microarchitectures can in many cases forward the data directly from the store to the load, This "store forwarding" saves cycles by enabling the load to directly obtain the data instead of accessing the data from cache or memory. A "store forward block" occurs in cases that a store cannot be forwarded to the load. The most typical case of store forward block on Intel Core microarchiticutre that a small store cannot be forwarded to a large load. The estimated penalty for a store forward block is ~13 cycles. This pass tries to recognize and handle cases where "store forward block" is created by the compiler when lowering memcpy calls to a sequence of a load and a store. The pass currently only handles cases where memcpy is lowered to XMM/YMM registers, it tries to break the memcpy into smaller copies. breaking the memcpy should be possible since there is no atomicity guarantee for loads and stores to XMM/YMM. Change-Id: Ic41aa9ade6512e0478db66e07e2fde41b4fb35f9 llvm-svn: 325128
* [X86][SSE] Relax type legality for combineTruncateWithSat PACKSS/PACKUS ↵Simon Pilgrim2018-02-141-4/+4
| | | | | | | | truncation While the AVX512 VTRUNCS/VTRUNCUS instructions require legal types, truncateVectorWithPACK handles cases with multiples of legal types through splitting/concatenation. So we just need to ensure that the src/dst scalar types are correct and leave truncateVectorWithPACK to handle the rest of it. llvm-svn: 325127
* Recommit r325001: [CallSiteSplitting] Support splitting of blocks with ↵Florian Hahn2018-02-141-22/+77
| | | | | | | | | | | | | | | | | | | | instrs before call. For basic blocks with instructions between the beginning of the block and a call we have to duplicate the instructions before the call in all split blocks and add PHI nodes for uses of the duplicated instructions after the call. Currently, the threshold for the number of instructions before a call is quite low, to keep the impact on binary size low. Reviewers: junbuml, mcrosier, davidxl, davide Reviewed By: junbuml Differential Revision: https://reviews.llvm.org/D41860 llvm-svn: 325126
* [LoopInterchange] Incrementally update the dominator tree.Florian Hahn2018-02-141-34/+40
| | | | | | | | | | | | | We can use incremental dominator tree updates to avoid re-calculating the dominator tree after interchanging 2 loops. Reviewers: dmgreen, kuhar Reviewed By: kuhar Differential Revision: https://reviews.llvm.org/D43176 llvm-svn: 325122
* [Utils] Salvage the debug info of DCE'ed 'and' instructionsPetar Jovanovic2018-02-143-0/+5
| | | | | | | | | | Preserve debug info from a dead 'and' instruction with a constant. Patch by Djordje Todorovic. Differential Revision: https://reviews.llvm.org/D43163 llvm-svn: 325119
* Revert r325107 (case folding DJB hash) and subsequent build fixPavel Labath2018-02-143-820/+1
| | | | | | | The "knownValuesUnicode" test in the patch fails on ppc64 and arm64 bots. Reverting while I investigate. llvm-svn: 325115
* [IRMover] Move type name extraction to a separate function. NFCEugene Leviant2018-02-141-6/+11
| | | | llvm-svn: 325110
* Implement a case-folding version of DJB hashPavel Labath2018-02-143-1/+820
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch implements a variant of the DJB hash function which folds the input according to the algorithm in the Dwarf 5 specification (Section 6.1.1.4.5), which in turn references the Unicode Standard (Section 5.18, "Case Mappings"). To achieve this, I have added a llvm::sys::unicode::foldCharSimple function, which performs this mapping. The implementation of this function was generated from the CaseMatching.txt file from the Unicode spec using a python script (which is also included in this patch). The script tries to optimize the function by coalescing adjecant mappings with the same shift and stride (terms I made up). Theoretically, it could be made a bit smarter and merge adjecant blocks that were interrupted by only one or two characters with exceptional mapping, but this would save only a couple of branches, while it would greatly complicate the implementation, so I deemed it was not worth it. Since we assume that the vast majority of the input characters will be US-ASCII, the folding hash function has a fast-path for handling these, and only whips out the full decode+fold+encode logic if we encounter a character outside of this range. It might be possible to implement the folding directly on utf8 sequences, but this would also bring a lot of complexity for the few cases where we will actually need to process non-ascii characters. Reviewers: JDevlieghere, aprantl, probinson, dblaikie Subscribers: mgorny, hintonda, echristo, clayborg, vleschuk, llvm-commits Differential Revision: https://reviews.llvm.org/D42740 llvm-svn: 325107
* Adding a width of the GEP index to the Data Layout.Elena Demikhovsky2018-02-1420-103/+162
| | | | | | | | | | | | | | | | | | Making a width of GEP Index, which is used for address calculation, to be one of the pointer properties in the Data Layout. p[address space]:size:memory_size:alignment:pref_alignment:index_size_in_bits. The index size parameter is optional, if not specified, it is equal to the pointer size. Till now, the InstCombiner normalized GEPs and extended the Index operand to the pointer width. It works fine if you can convert pointer to integer for address calculation and all registered targets do this. But some ISAs have very restricted instruction set for the pointer calculation. During discussions were desided to retrieve information for GEP index from the Data Layout. http://lists.llvm.org/pipermail/llvm-dev/2018-January/120416.html I added an interface to the Data Layout and I changed the InstCombiner and some other passes to take the Index width into account. This change does not affect any in-tree target. I added tests to cover data layouts with explicitly specified index size. Differential Revision: https://reviews.llvm.org/D42123 llvm-svn: 325102
* [SelectionDAG] Remove duplicate code from TargetLowering::SimplifySetCC.Craig Topper2018-02-141-4/+0
| | | | | | This exact code already exists a little further up. llvm-svn: 325101
* [X86] Remove dead code from retpoline thunk generationReid Kleckner2018-02-141-26/+0
| | | | | | Follow-up to r325049 llvm-svn: 325085
* Fix off-by-one in set_thread_name which causes truncation to fail on LinuxSam McCall2018-02-131-1/+2
| | | | llvm-svn: 325069
* [globalisel][legalizerinfo] Follow up on post-commit review comments after ↵Daniel Sanders2018-02-132-1/+4
| | | | | | | | | | | | | | r323681 * Document most API's * Delete a useless function call * Fix a discrepancy between the single and multi-opcode variants of getActionDefinitions(). The multi-opcode variant now requires that more than one opcode is requested. Previously it acted much like the single-opcode form but unnecessarily enforced the requirements of the multi-opcode form. llvm-svn: 325067
* [GVN] Salvage debug info from dead instsVedant Kumar2018-02-131-0/+2
| | | | | | | | | | This preserves an additional 581 unique source variables in a stage2 build of clang (according to `llvm-dwarfdump --statistics`). It increases the size of the .debug_loc section by 0.1% (or 87139 bytes). Differential Revision: https://reviews.llvm.org/D43255 llvm-svn: 325063
* [InstCombine] (lshr X, 31) * Y --> (ashr X, 31) & YSanjay Patel2018-02-131-25/+13
| | | | | | | | | | | This replaces the bit-tracking based fold that did the same thing, but it only worked for scalars and not directly. There is no evidence in existing regression tests that the greater power of bit-tracking was needed here, but we should be aware of this potential loss of optimization. llvm-svn: 325062
* [X86] Use EDI for retpoline when no scratch regs are leftReid Kleckner2018-02-132-63/+29
| | | | | | | | | | | | | | | | | | | | Summary: Instead of solving the hard problem of how to pass the callee to the indirect jump thunk without a register, just use a CSR. At a call boundary, there's nothing stopping us from using a CSR to hold the callee as long as we save and restore it in the prologue. Also, add tests for this mregparm=3 case. I wrote execution tests for __llvm_retpoline_push, but they never got committed as lit tests, either because I never rewrote them or because they got lost in merge conflicts. Reviewers: chandlerc, dwmw2 Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D43214 llvm-svn: 325049
* [InstCombine] (bool X) * Y --> X ? Y : 0Sanjay Patel2018-02-131-0/+9
| | | | | | | | | This is both a functional improvement for vectors and an efficiency improvement for scalars. The existing code below the new folds does the same thing for scalars, but in an indirect and expensive way. llvm-svn: 325048
* Document the shortcomings of DwarfExpression::addMachineReg().Adrian Prantl2018-02-131-4/+12
| | | | | | | | | | | | | Also make a drive-by-fix of a bug in the subregister scan code that only triggers with an incomplete or otherwise very irregular machine description. rdar://problem/37404493 This re-applies r324972 with an early exit in the case of a complete failure to make this commit NFC again as intended. llvm-svn: 325041
* [DeadStoreElimination] Salvage debug info from dead instsVedant Kumar2018-02-131-0/+3
| | | | | | | | | | According to `llvm-dwarfdump --statistics` this salvages 43 additional unique source variables in a stage2 build of clang. It increases the size of the .debug_loc section by 0.002% (or 2864 bytes). Differential Revision: https://reviews.llvm.org/D43220 llvm-svn: 325035
* Revert r324903 "[AArch64] Refactor identification of SIMD immediates"Hans Wennborg2018-02-131-280/+368
| | | | | | | | | | | It caused "Cannot select: t33: f64 = AArch64ISD::FMOV Constant:i32<0>" in Chromium builds. See PR36369. > Get rid of icky goto loops and make the code easier to maintain (NFC). > > Differential revision: https://reviews.llvm.org/D42723 llvm-svn: 325034
* [CodeGen] Print bundled instructions using the MIR syntax in -debug outputFrancis Visoiu Mistrih2018-02-131-7/+23
| | | | | | | | | | | | | | | | | Old syntax: BUNDLE implicit-def %r0, implicit-def %r1, implicit %r2 * %r0 = SOME_OP %r2 * %r1 = ANOTHER_OP internal %r0 New syntax: BUNDLE implicit-def %r0, implicit-def %r1, implicit %r2 { %r0 = SOME_OP %r2 %r1 = ANOTHER_OP internal %r0 } llvm-svn: 325032
* [AMDGPU] Change constant addr space to 4Yaxun Liu2018-02-136-9/+10
| | | | | | Differential Revision: https://reviews.llvm.org/D43170 llvm-svn: 325030
* [DAGCombiner] Add one use check to fold (not (and x, y)) -> (or (not x), ↵Craig Topper2018-02-131-2/+2
| | | | | | | | | | | | | | | | | | | (not y)) Summary: If the and has an additional use we shouldn't invert it. That creates an additional instruction. While there add a one use check to the transform above that looked similar. Reviewers: spatel, RKSimon Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43225 llvm-svn: 325019
OpenPOWER on IntegriCloud