path: root/llvm/lib
Commit message (Author, Date; Files changed, Lines -/+)
* Revert "[COFF] Use comdat shared constants for MinGW as well"Martin Storsjo2018-07-263-2/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit r337951. While that kind of shared constant generally works fine in a MinGW setting, it broke some cases of inline assembly that worked before: $ cat const-asm.c int MULH(int a, int b) { int rt, dummy; __asm__ ( "imull %3" :"=d"(rt), "=a"(dummy) :"a"(a), "rm"(b) ); return rt; } int func(int a) { return MULH(a, 1); } $ clang -target x86_64-win32-gnu -c const-asm.c -O2 const-asm.c:4:9: error: invalid variant '00000001' "imull %3" ^ <inline asm>:1:15: note: instantiated into assembly here imull __real@00000001(%rip) ^ A similar error is produced for i686 as well. The same test with a target of x86_64-win32-msvc or i686-win32-msvc works fine. llvm-svn: 338018
* [x86/SLH] Extract the logic to trace predicate state through calls to a helper function with a nice overview comment. NFC. (Chandler Carruth, 2018-07-26; 1 file, -19/+39)

  This is a preparatory refactoring for implementing another component of the mitigation here that was described in the design document but hadn't been implemented yet.

  llvm-svn: 338016

* [AArch64] Armv8.2-A: add the crypto extensions (Sjoerd Meijer, 2018-07-26; 4 files, -5/+203)

  This adds MC support for the crypto instructions that were made optional extensions in Armv8.2-A (AArch64 only).

  Differential Revision: https://reviews.llvm.org/D49370

  llvm-svn: 338010

* [X86] Don't use CombineTo to skip adding new nodes to the DAGCombiner worklist in combineMul (Craig Topper, 2018-07-26; 1 file, -5/+1)

  I'm not sure if this was trying to avoid optimizing the new nodes further, or maybe to prevent a cycle if something tried to reform the multiply. Either way, I don't think it's a reliable way to do that. If the user of the expanded multiply is visited by the DAGCombiner after this conversion happens, the DAGCombiner will check its operands, see that they haven't been visited by the DAGCombiner before, and then add the first node to the worklist. This process will repeat until all the new nodes are visited. So this seems like an unreliable prevention at best.

  This patch just returns the new nodes like any other combine. If this starts causing problems, we can try to add target-specific nodes or something to more directly prevent optimizations. Now that we handle the combine normally, we can fold any negates the mul expansion creates into their users, since those will be visited now.

  llvm-svn: 338007

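  A minimal sketch of the pattern this moves to, assuming the usual X86 combine signature; the helper name and the x*3 expansion are illustrative, not the commit's actual code:

  ```cpp
  #include "llvm/CodeGen/SelectionDAG.h"
  #include "llvm/CodeGen/TargetLowering.h"

  using namespace llvm;

  // Sketch: rather than calling DCI.CombineTo(N, New), which sidesteps the
  // normal worklist handling for the freshly built nodes, just return the
  // replacement value. The DAGCombiner will visit it and queue its operands.
  static SDValue combineMulSketch(SDNode *N, SelectionDAG &DAG,
                                  TargetLowering::DAGCombinerInfo &DCI) {
    (void)DCI; // kept to show the signature; no CombineTo call is needed
    SDLoc DL(N);
    EVT VT = N->getValueType(0);
    SDValue X = N->getOperand(0);
    // Illustrative expansion: x * 3 -> (x << 1) + x.
    SDValue Shl = DAG.getNode(ISD::SHL, DL, VT, X, DAG.getConstant(1, DL, VT));
    return DAG.getNode(ISD::ADD, DL, VT, Shl, X); // returned, not CombineTo'd
  }
  ```
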
* Revert r337981: it breaks the debuginfo-tests (Alex Lorenz, 2018-07-26; 3 files, -41/+51)

  This commit caused a regression in the debuginfo-tests:

    FAIL: debuginfo-tests :: apple-accel.cpp (40748 of 46595)

  llvm-svn: 337997

* [X86] Remove some unnecessary explicit calls to DCI.AddToWorkList. (Craig Topper, 2018-07-26; 1 file, -10/+0)

  These calls were making sure some newly created nodes were added to the worklist, but the DAGCombiner has internal support for ensuring it has visited all nodes: any time it visits a node, it ensures the operands have been queued to be visited as well. This means we only need to return the last new node; the DAGCombiner will take care of adding its inputs, thus walking backwards through all the new nodes.

  llvm-svn: 337996

* [Support] Introduce createStringError helper function (Victor Leschuk, 2018-07-26; 1 file, -0/+4)

  The function in question is copy-pasted lots of times in DWARF-related classes, so it makes sense to place its implementation into the Support library.

  Reviewed by: lhames

  Differential Revision: https://reviews.llvm.org/D49824

  llvm-svn: 337995

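  A hedged usage sketch: the `checkVersion` helper and its message are hypothetical, while `createStringError`, `inconvertibleErrorCode`, and `Error::success` are the real LLVM APIs involved:

  ```cpp
  #include "llvm/Support/Error.h"

  using namespace llvm;

  // Hypothetical validation helper: returns a formatted StringError
  // instead of hand-rolling the same boilerplate at every call site.
  static Error checkVersion(unsigned Version) {
    if (Version < 2 || Version > 5)
      return createStringError(inconvertibleErrorCode(),
                               "unsupported DWARF version %u", Version);
    return Error::success();
  }
  ```
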
* [GlobalISel] Fall back to SDISel for swifterror/swiftself attributes. (Amara Emerson, 2018-07-26; 2 files, -0/+18)

  We don't currently support these; fall back until we do.

  llvm-svn: 337994

* [DWARF v5] Don't report an error when the .debug_rnglists section is empty or non-existent. (Wolfgang Pieb, 2018-07-26; 1 file, -16/+18)

  Fixes PR38297.

  Reviewer: JDevlieghere

  Differential Revision: https://reviews.llvm.org/D49815

  llvm-svn: 337993

* [LoadStoreVectorizer] Use const reference (Fangrui Song, 2018-07-26; 1 file, -4/+6)

  llvm-svn: 337992

* RegUsageInfo: Cleanup; NFC (Matthias Braun, 2018-07-26; 4 files, -72/+69)

  - Remove unnecessary anchor function
  - Remove unnecessary override of getAnalysisUsage
  - Use references instead of pointers where things cannot be nullptr
  - Use ArrayRef instead of std::vector where possible

  llvm-svn: 337989

* CodeGen.cpp: Sort initializers; NFC (Matthias Braun, 2018-07-26; 1 file, -2/+2)

  llvm-svn: 337988

* CodeGen: Cleanup regmask construction; NFC (Matthias Braun, 2018-07-26; 5 files, -11/+15)

  - Avoid duplication of the regmask size calculation.
  - Simplify the allocateRegisterMask() call.
  - Rename allocateRegisterMask() to allocateRegMask() to be consistent with naming in MachineOperand.

  llvm-svn: 337986

* [DWARF v5] Don't emit multiple DW_AT_rnglists_base attributes. Some refactoring of range list emission and added test cases. (Wolfgang Pieb, 2018-07-25; 3 files, -51/+41)

  Reviewer: dblaikie

  Differential Revision: https://reviews.llvm.org/D49522

  llvm-svn: 337981

* bpf: new option -bpf-expand-memcpy-in-order to expand memcpy in order (Yonghong Song, 2018-07-25; 9 files, -8/+255)

  Some BPF JIT backends would like to optimize memcpy in their own architecture-specific way. At the moment, however, there is no way for JIT backends to see memcpy semantics reliably, because the LLVM BPF backend expands memcpy into load/store sequences and may then schedule them apart from each other. As a result, BPF JIT backends inside the kernel can't reliably recognize memcpy semantics by peepholing the BPF sequence.

  This patch introduces new intrinsic-expansion infrastructure for memcpy. To get a stable in-order load/store sequence from memcpy, we first lower memcpy into a BPF::MEMCPY node, which is then expanded into in-order load/store sequences in the expandPostRAPseudo pass, which runs after instruction scheduling. This way, kernel JIT backends can reliably recognize memcpy by scanning the BPF sequence.

  This new memcpy expansion infrastructure is gated by a new option:

    -bpf-expand-memcpy-in-order

  Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
  Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
  Signed-off-by: Yonghong Song <yhs@fb.com>

  llvm-svn: 337977

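  A small illustration with a hypothetical source file, showing the kind of fixed-size copy whose expansion order the new option pins down:

  ```cpp
  #include <cstring>

  struct Packet {
    char dst[16];
    char src[16];
  };

  // With -bpf-expand-memcpy-in-order, this constant-size memcpy stays a
  // BPF::MEMCPY pseudo until after scheduling, so the final BPF sequence
  // is a run of loads/stores in ascending-offset order that a kernel JIT
  // can pattern-match back into a memcpy.
  void copyPayload(Packet &p) {
    std::memcpy(p.dst, p.src, sizeof p.dst);
  }
  ```
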
* [GlobalMerge] Handle llvm.compiler.used correctly. (Eli Friedman, 2018-07-25; 1 file, -4/+5)

  Reuse the handling for llvm.used, and don't transform such globals. Fixes a failure on the asan buildbot caused by my previous commit.

  llvm-svn: 337973

* [SelectionDAG] try to convert funnel shift directly to rotate if legal (Sanjay Patel, 2018-07-25; 1 file, -1/+10)

  If the DAGCombiner's rotate matching was working as expected, I don't think we'd see any test diffs here. This sidesteps the issue of custom lowering for rotates raised in PR38243 (https://bugs.llvm.org/show_bug.cgi?id=38243) by only dealing with legal operations.

  llvm-svn: 337966

* [LSV] Look through selects for consecutive addresses (Roman Tereshin, 2018-07-25; 1 file, -15/+62)

  In some cases LSV sees

    (load/store _ (select _ <pointer expression> <pointer expression>))

  patterns in input IR, often due to sinking and other forms of CFG simplification, sometimes interspersed with bitcasts and all-constant-indices GEPs. With this patch, the `areConsecutivePointers` method attempts to handle select instructions. This leads to an increased number of successful vectorizations. See the sketch after this entry for the source shape involved.

  Technically, select instructions could appear in index arithmetic as well; however, we don't see those in our test suites / benchmarks. Also, there is a lot more freedom in IR shapes computing integral indices in general than in what's common in pointer computations, and it appears quite unreliable to do anything short of making select instructions first-class citizens of Scalar Evolution, which for the purposes of this patch is most definitely overkill.

  Reviewed By: rampitec

  Differential Revision: https://reviews.llvm.org/D49428

  llvm-svn: 337965

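  A hedged example (not from the commit) of source that, after if-conversion into a select, produces the load-of-select shape described above:

  ```cpp
  // After CFG simplification, the branch below can become a select of
  // the two pointer expressions feeding a single load, roughly:
  //   %p = select i1 %c, double* %a, double* %b
  //   %v = load double, double* %p
  double pick(bool c, double *a, double *b) {
    return c ? *a : *b;
  }
  ```
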
* [GlobalMerge] Allow merging globals with arbitrary alignment. (Eli Friedman, 2018-07-25; 1 file, -18/+26)

  Instead of depending on implicit padding from the structure layout code, use a packed struct and emit the padding explicitly.

  Differential Revision: https://reviews.llvm.org/D49710

  llvm-svn: 337961

* Revert r337904: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions. (Florian Hahn, 2018-07-25; 2 files, -140/+10)

  I suspect it is causing the clang-stage2-Rthinlto failures.

  llvm-svn: 337956

* Add missing 'override', fixing compilation with some compilers since SVN r337950 (Martin Storsjo, 2018-07-25; 1 file, -1/+1)

  llvm-svn: 337952

* [COFF] Use comdat shared constants for MinGW as well (Martin Storsjo, 2018-07-25; 3 files, -15/+2)

  GNU binutils tools have no problems with this kind of shared constants, provided that we actually hook it up completely in AsmPrinter and produce a global symbol. This effectively reverts SVN r335918 by hooking the rest of it up properly.

  This feature was implemented originally in SVN r213006, with no reason why it can't be used for MinGW other than the fact that GCC doesn't do it while MSVC does.

  Differential Revision: https://reviews.llvm.org/D49646

  llvm-svn: 337951

* [COFF] Hoist constant pool handling from X86AsmPrinter into AsmPrinter (Martin Storsjo, 2018-07-25; 6 files, -30/+33)

  In SVN r334523, the first half of comdat constant pool handling was hoisted from X86WindowsTargetObjectFile (which, despite the name, was only used for MSVC targets) into the arch-independent TargetLoweringObjectFileCOFF, but the other half of the handling was left behind in X86AsmPrinter::GetCPISymbol.

  With only half of the handling in place, inconsistent comdat sections/symbols are created, causing issues both with GNU binutils (avoided for X86 in SVN r335918) and with the MS linker, which would complain like this:

    fatal error LNK1143: invalid or corrupt file: no symbol for COMDAT section 0x4

  Differential Revision: https://reviews.llvm.org/D49644

  llvm-svn: 337950

* [ARM] Prefer lsls+lsrs over lsls+ands or lsrs+ands in Thumb1. (Eli Friedman, 2018-07-25; 1 file, -0/+81)

  Saves materializing the immediate for the "ands". Corresponding patterns exist for lsrs+lsls, but that seems less common in practice.

  Now implemented as a DAGCombine.

  Differential Revision: https://reviews.llvm.org/D49585

  llvm-svn: 337945

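  An illustrative case (assumed, not taken from the patch): clearing the top bits of a value in Thumb1.

  ```cpp
  // Clearing the top 8 bits of a 32-bit value. With "ands", the mask
  // must first be materialized into a register:
  //   ldr  r1, =0x00ffffff
  //   ands r0, r1
  // The shift pair needs no constant materialization:
  //   lsls r0, r0, #8
  //   lsrs r0, r0, #8
  unsigned clearTop8(unsigned x) { return x & 0x00ffffffu; }
  ```
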
* [SCEV] Add [zs]ext{C,+,x} -> (D + [zs]ext{C-D,+,x})<nuw><nsw> transform (Roman Tereshin, 2018-07-25; 1 file, -63/+104)

  ... as well as

    sext(C + x + ...) -> (D + sext(C-D + x + ...))<nuw><nsw>

  similar to the equivalent transformation for zext's, if the top-level addition in (D + (C-D + x * n)) could be proven to not wrap, where the choice of D also maximizes the number of trailing zeroes of (C-D + x * n), ensuring homogeneous behaviour of the transformation and better canonicalization of such AddRecs. (Indeed, there are 2^(2w) different expressions in `B1 + ext(B2 + Y)` form for the same Y, but only 2^(2w - k) different expressions in the resulting `B3 + ext((B4 * 2^k) + Y)` form, where w is the bit width of the integral type.)

  This patch generalizes the

    sext(C1 + C2*X) --> sext(C1) + sext(C2*X) and
    sext{C1,+,C2}   --> sext(C1) + sext{0,+,C2}

  transformations added in r209568, relaxing the requirements the following way:

  1. C2 doesn't have to be a power of 2; it's enough if it's divisible by 2 a sufficient number of times.
  2. C1 doesn't have to be less than C2; instead of extracting the entire C1, we can split it into 2 terms (00...0XXX + YY...Y000), keep the second one that may cause wrapping within the extension operator, and move the first one, which doesn't affect wrapping, out of the extension operator, enabling further simplifications.
  3. C1 and C2 don't have to be positive; splitting C1 as shown above produces a sum that is guaranteed not to wrap, signed or unsigned.
  4. In the AddExpr case there could be more than 2 terms, and in the AddExpr case the 2nd and following terms, and in the AddRecExpr case the Step component, don't have to be in the C2*X form or constant (respectively); they just need to have enough trailing zeros, which in turn could be guaranteed by means other than arithmetic, e.g. by a pointer alignment.
  5. The extension operator doesn't have to be a sext; the same transformation works and is profitable for zext's as well.

  Apparently, optimizations like SLPVectorizer currently fail to vectorize even rather trivial cases like the following:

    double bar(double *a, unsigned n) {
      double x = 0.0;
      double y = 0.0;
      for (unsigned i = 0; i < n; i += 2) {
        x += a[i];
        y += a[i + 1];
      }
      return x * y;
    }

  If compiled with `clang -std=c11 -Wpedantic -Wall -O3 main.c -S -o - -emit-llvm` (!{!"clang version 7.0.0 (trunk 337339) (llvm/trunk 337344)"}), it produces scalar code with the loop not unrolled with the unsigned `n` and `i` (as shown above), but a vectorized and unrolled loop with signed `n` and `i`. With the changes made in this commit the unsigned version will be vectorized (though not unrolled, for unclear reasons).

  How it all works:

  Let's say we have an AddExpr that looks like (C + x + y + ...), where C is a constant and x, y, ... are arbitrary SCEVs. Let's compute the minimum number of trailing zeroes guaranteed for that sum without the constant term: (x + y + ...). If, for example, those terms look as follows:

              i
    XXXX...X000
    YYYY...YY00
        ...
    ZZZZ...0000

  then the rightmost non-guaranteed-zero bit (a potential one at the i-th position above) can change the bits of the sum to the left (and at the i-th position itself), but it can not possibly change the bits to the right. So we can compute the number of trailing zeroes by taking the minimum of the numbers of trailing zeroes of the terms.

  Now let's say that our original sum with the constant is effectively just C + X, where X = x + y + .... Let's also say that we've got 2 guaranteed trailing zeros for X:

              j
    CCCC...CCCC
    XXXX...XX00    // this is X = (x + y + ...)

  Any bit of C to the left of j may in the end cause the C + X sum to wrap, but the rightmost 2 bits of C (at positions j and j - 1) do not affect wrapping in any way. If the upper bits cause a wrap, it will be a wrap regardless of the values of the 2 least significant bits of C. If the upper bits do not cause a wrap, it won't be a wrap regardless of the values of the 2 bits on the right (again).

  So let's split C into 2 constants as follows:

    0000...00CC = D
    CCCC...CC00 = (C - D)

  and represent the whole sum as D + (C - D + X). The second term of this new sum looks like this:

    CCCC...CC00
    XXXX...XX00
    -----------    // let's add them up
    YYYY...YY00

  The sum above (let's call it Y) may or may not wrap, we don't know, so we need to keep it under a sext/zext. Adding D to that sum, though, will never wrap, signed or unsigned, whether performed on the original bit width or the extended one, because all that final add does is set the 2 least significant bits of Y to the bits of D:

    YYYY...YY00 = Y
    0000...00CC = D
    -----------    <nuw><nsw>
    YYYY...YYCC

  Which means we can safely move that D out of the sext or zext and claim that the top-level sum neither sign-wraps nor unsigned-wraps.

  Let's run an example. Say we're working in i8's and the original expression (the zext's or sext's operand) is 21 + 12x + 8y. It goes like this:

    0001 0101   // 21
    XXXX XX00   // 12x
    YYYY Y000   // 8y

    0001 0101   // 21
    ZZZZ ZZ00   // 12x + 8y

    0000 0001   // D
    0001 0100   // 21 - D = 20
    ZZZZ ZZ00   // 12x + 8y

    0000 0001   // D
    WWWW WW00   // 21 - D + 12x + 8y = 20 + 12x + 8y

  therefore

    zext(21 + 12x + 8y) = (1 + zext(20 + 12x + 8y))<nuw><nsw>

  This approach could be improved if we move away from using trailing zeroes and use KnownBits instead. For instance, with KnownBits we could have the following picture:

              i
    10 1110...0011    // this is C
    XX X1XX...XX00    // this is X = (x + y + ...)

  Notice that some of the bits of X are known ones; also notice that the known bits of X are interspersed with unknown bits, not grouped on the right or left.

  We can see at position i that C(i) and X(i) are both known ones, therefore the (i + 1)th carry bit is guaranteed to be 1 regardless of the bits of C to the right of i. For instance, the C(i - 1) bit only affects the bits of the sum at positions i - 1 and i, and does not influence whether the sum is going to wrap or not. Therefore we could split the constant C the following way:

              i
    00 0010...0011 = D
    10 1100...0000 = (C - D)

  Let's compute the KnownBits of (C - D) + X:

       XX1 1          // 1 = carry bit, blanks stand for known zeroes
    10 1100...0000    // (C - D)
    XX X1XX...XX00    // X
    --- -----------
    XX X0XX...XX00

  Whether this add wraps or not essentially depends on the bits of X. Adding D to this sum, however, is guaranteed not to wrap:

    0  X
    00 0010...0011    // D
    sX X0XX...XX00    // (C - D) + X
    --- -----------
    sX XXXX XX11

  As can be seen above, adding D preserves the sign bit of (C - D) + X, if any, and has a guaranteed 0 carry out, as expected.

  The more bits of (C - D) we constrain, the better the transformations introduced here canonicalize expressions, as it leaves less freedom to what values the constant part of ((C - D) + x + y + ...) can take.

  Reviewed By: mzolotukhin, efriedma

  Differential Revision: https://reviews.llvm.org/D48853

  llvm-svn: 337943

* Add an option to specify the name of a function whose CFG is to be viewed/printed (Xinliang David Li, 2018-07-25; 1 file, -0/+11)

  Differential Revision: https://reviews.llvm.org/D49447

  llvm-svn: 337940

* Fix corruption of result number in LegalizeVectorOps.cpp (Ulrich Weigand, 2018-07-25; 1 file, -1/+2)

  When VectorLegalizer::LegalizeOp creates a new SDValue after iterating over its arguments, we need to refer to the same result number of the new node that the original value used.

  Reviewed by: cameron.mcinally

  Differential Revision: https://reviews.llvm.org/D49805

  llvm-svn: 337939

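  A minimal sketch of the idea; the `fixup` helper is hypothetical, while `SDValue(SDNode *, unsigned)` and `getResNo()` are real SelectionDAG APIs:

  ```cpp
  #include "llvm/CodeGen/SelectionDAG.h"

  using namespace llvm;

  // If the original SDValue referred to result number k of a node with
  // multiple results (e.g. a load producing a value and a chain), the
  // replacement must use result k of the new node too, not result 0.
  static SDValue fixup(SDValue Orig, SDNode *NewNode) {
    return SDValue(NewNode, Orig.getResNo()); // preserve the result number
  }
  ```
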
* [AMDGPU] Use AssumptionCacheTracker in the divrem32 expansion (Stanislav Mekhanoshin, 2018-07-25; 1 file, -13/+21)

  Differential Revision: https://reviews.llvm.org/D49761

  llvm-svn: 337938

* Fix llvm::ComputeNumSignBits with some operations and llvm.assume (Stanislav Mekhanoshin, 2018-07-25; 1 file, -7/+7)

  Currently ComputeNumSignBits exits early while processing some of the operations (add, sub, mul, and select). This prevents the function from using the AssumptionCacheTracker if one is passed.

  Differential Revision: https://reviews.llvm.org/D49759

  llvm-svn: 337936

* Revert "dwarfgen: Add support for generating the debug_str_offsets section, ↵Pavel Labath2018-07-255-32/+27
| | | | | | | | | take 2" This reverts commit r337933. The build error is fixed but the test now fails on the darwin buildbots. Investigating... llvm-svn: 337935
* [Hexagon] Properly scale bit index when extracting elements from vNi1 (Krzysztof Parzyszek, 2018-07-25; 1 file, -1/+3)

  For example, v = <2 x i1> is represented as bbbbaaaa in a predicate register, where b = v[1] and a = v[0]. Extracting v[1] is equivalent to extracting bit 4 from the predicate register.

  llvm-svn: 337934

* dwarfgen: Add support for generating the debug_str_offsets section, take 2 (Pavel Labath, 2018-07-25; 5 files, -27/+32)

  This recommits r337910 after fixing an "ambiguous call to addAttribute" error with some compilers (gcc circa 4.9 and MSVC). It seems that these compilers will consider a "false -> pointer" conversion during overload resolution. This created ambiguity once I added an overload which takes an MCExpr * as an argument. I fix this by making the new overload take MCExpr&, which avoids the conversion. It also documents the fact that we expect a valid MCExpr object.

  Original commit message follows:

  The motivation for this is D49493, where we'd like to test details of debug_str_offsets behavior which is difficult to trigger from a traditional test. This adds the plumbing necessary for dwarfgen to generate this section. The more interesting changes are:

  - I've moved the emitStringOffsetsTableHeader function from DwarfFile to DwarfStringPool, so I can generate the section header more easily from the unit test.
  - I've added a new addAttribute overload taking an MCExpr*. This is used to generate the DW_AT_str_offsets_base attribute, which links a compile unit to the offset table.

  I've also added a basic test for reading and writing DW_FORM_strx forms.

  Reviewers: dblaikie, JDevlieghere, probinson

  Subscribers: llvm-commits, aprantl

  Differential Revision: https://reviews.llvm.org/D49670

  llvm-svn: 337933

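  A standalone illustration of that class of ambiguity (assumed shapes, not the actual dwarfgen signatures):

  ```cpp
  #include <cstdint>

  struct DIEBuilder {
    void addAttribute(uint64_t Value);    // integer-valued attribute
    void addAttribute(const char *Expr);  // pointer-valued attribute
  };

  void demo(DIEBuilder &D) {
    // Per the commit message, older compilers (gcc ~4.9, MSVC) consider a
    // 'false -> pointer' conversion during overload resolution, so a call
    // like D.addAttribute(false) becomes ambiguous there. Changing the
    // pointer overload to take a reference removes that conversion from
    // consideration entirely.
    D.addAttribute(uint64_t(0)); // unambiguous: explicit integer form
  }
  ```
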
* Move JIT listener C binding fallbacks to ExecutionEngineBindings.cpp. (Andres Freund, 2018-07-25; 1 file, -0/+24)

  Initially, in https://reviews.llvm.org/D44890, I had these defined as empty functions inside the header when the respective event listener was not built in. As done in that commit, that wasn't correct, because it was an ODR violation. Krasimir hot-fixed that in r333265, but that wasn't quite right either, because it'd lead to the symbol not being available. Instead, just move the fallbacks to ExecutionEngineBindings.cpp. We could define them as static inlines in the header too, but I don't think it matters.

  Reviewers: whitequark

  Subscribers: llvm-commits

  Differential Revision: https://reviews.llvm.org/D49654

  llvm-svn: 337930

* Revert "dwarfgen: Add support for generating the debug_str_offsets section"Pavel Labath2018-07-255-32/+27
| | | | | | | | | This reverts commit r337910 as it's generating "ambiguous call to addAttribute" errors on some bots. Will resubmit once I get a chance to look into the problem. llvm-svn: 337924
* [MIPS GlobalISel] Lower pointer arguments (Petar Jovanovic, 2018-07-25; 2 files, -1/+3)

  Add support for lowering pointer arguments. Changing the type from pointer to integer is already done in MipsTargetLowering::getRegisterTypeForCallingConv.

  Patch by Petar Avramovic.

  Differential Revision: https://reviews.llvm.org/D49419

  llvm-svn: 337912

* dwarfgen: Add support for generating the debug_str_offsets section (Pavel Labath, 2018-07-25; 5 files, -27/+32)

  Summary:
  The motivation for this is D49493, where we'd like to test details of debug_str_offsets behavior which is difficult to trigger from a traditional test. This adds the plumbing necessary for dwarfgen to generate this section. The more interesting changes are:

  - I've moved the emitStringOffsetsTableHeader function from DwarfFile to DwarfStringPool, so I can generate the section header more easily from the unit test.
  - I've added a new addAttribute overload taking an MCExpr*. This is used to generate the DW_AT_str_offsets_base attribute, which links a compile unit to the offset table.

  I've also added a basic test for reading and writing DW_FORM_strx forms.

  Reviewers: dblaikie, JDevlieghere, probinson

  Subscribers: llvm-commits, aprantl

  Differential Revision: https://reviews.llvm.org/D49670

  llvm-svn: 337910

* [SystemZ] Use tablegen loops in SchedModels (Jonas Paulsson, 2018-07-25; 5 files, -229/+98)

  NFC changes to make the scheduler TableGen files more readable, by using loops instead of many similar defs that differ only in, e.g., a latency value.

  https://reviews.llvm.org/D49598

  Review: Ulrich Weigand, Javed Abshar

  llvm-svn: 337909

* Recommit r333268: [IPSCCP] Use PredicateInfo to propagate facts from cmp instructions. (Florian Hahn, 2018-07-25; 2 files, -10/+140)

  r337828 resolves a PredicateInfo issue with unnamed types.

  Original message:
  This patch updates IPSCCP to use PredicateInfo to propagate facts to true branches predicated by EQ and to false branches predicated by NE. As a follow-up, we should be able to extend it to also propagate additional facts about nonnull.

  Reviewers: davide, mssimpso, dberlin, efriedma

  Reviewed By: davide, dberlin

  llvm-svn: 337904

* Fix PR34170: Crash on inline asm with 64bit output in 32bit GPR (Thomas Preud'homme, 2018-07-25; 1 file, -20/+36)

  Add support for inline assembly with an output operand that does not naturally go in the register class it is constrained to (e.g. a double in a 32-bit GPR, as in the PR).

  llvm-svn: 337903

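  A reduced shape of the construct in question (hypothetical source; the empty asm body stands in for whatever the PR's asm did, since the operand constraint is the point):

  ```cpp
  // A 64-bit double output constrained to a 32-bit GPR via "=r".
  // Before the fix, assigning a register for this operand could crash.
  double readViaGPR() {
    double d;
    __asm__("" : "=r"(d)); // the "=r" constraint is what matters here
    return d;
  }
  ```
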
* [llvm-objdump] Add dynamic section printing to the private-headers option (Paul Semel, 2018-07-25; 1 file, -0/+138)

  Differential Revision: https://reviews.llvm.org/D49016

  llvm-svn: 337902

* [x86/SLH] Sink the return hardening into the main block-walk + hardening code. (Chandler Carruth, 2018-07-25; 1 file, -26/+17)

  This consolidates all our hardening calls and simplifies the code a bit. It seems much more clear to handle all of these together.

  No functionality changed here.

  llvm-svn: 337895

* [x86/SLH] Improve name and comments for the main hardening function. (Chandler Carruth, 2018-07-25; 1 file, -174/+190)

  This function actually does two things: it traces the predicate state through each of the basic blocks in the function (as that isn't directly handled by the SSA updater) *and* it hardens everything necessary in the block as it goes. These need to be done together so that we have the currently active predicate state to use at each point of the hardening.

  However, this also made it obvious that the flag to disable actual hardening of loads was flawed: it also disabled tracing the predicate state across function calls within the body of each block. So this patch sinks this debugging flag test to correctly guard just the hardening of loads.

  Unless load hardening was disabled, no functionality should change with this patch.

  llvm-svn: 337894

* [mips] Replace custom parsing logic for data directives by the `addAliasForDirective` (Simon Atanasyan, 2018-07-25; 3 files, -42/+12)

  The target-independent AsmParser doesn't recognise .hword, .word, and .dword, which are required for Mips. Currently MipsAsmParser recognises these through dispatch to MipsAsmParser::parseDataDirective, which contains logic equivalent to AsmParser::parseDirectiveValue. This patch allows reuse of AsmParser::parseDirectiveValue by making use of addAliasForDirective to support .hword, .word, and .dword.

  The original patch, provided by Alex Bradbury at D47001, was modified to fix handling of microMIPS symbols. AsmParser::parseDirectiveValue calls either EmitIntValue or EmitValue. In this patch we override EmitIntValue in the MipsELFStreamer to clear a pending set of microMIPS symbols.

  Differential revision: https://reviews.llvm.org/D49539

  llvm-svn: 337893

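  The registration presumably looks roughly like this (a sketch; the wrapper function is hypothetical, while `addAliasForDirective` is the real MCAsmParser API):

  ```cpp
  #include "llvm/MC/MCParser/MCAsmParser.h"

  using namespace llvm;

  // Sketch: route the Mips data directives to the generic AsmParser's
  // value-directive handling; .2byte/.4byte/.8byte are parsed by
  // AsmParser::parseDirectiveValue, so no custom parsing is needed.
  static void registerMipsDataDirectives(MCAsmParser &Parser) {
    Parser.addAliasForDirective(".hword", ".2byte");
    Parser.addAliasForDirective(".word", ".4byte");
    Parser.addAliasForDirective(".dword", ".8byte");
  }
  ```
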
* [Dominators] Assert if there is modification to DelBB while it is awaiting deletion (Chijun Sima, 2018-07-25; 1 file, -0/+7)

  Summary:
  Previously, passes used

  ```
  DomTreeUpdater DTU(DT, DomTreeUpdater::UpdateStrategy::Lazy);
  DTU.deleteBB(DelBB);
  ```

  to delete a BasicBlock. But passes which don't have the ability to update the DomTree (e.g. tailcallelim, simplifyCFG) cannot recognize a DelBB awaiting deletion and will continue to process it. This is a simple approach to notify devs of passes which may use DTU in the future to deal with deleted BasicBlocks under the Lazy strategy correctly.

  Reviewers: kuhar, brzycki, dmgreen

  Reviewed By: kuhar

  Subscribers: llvm-commits

  Differential Revision: https://reviews.llvm.org/D49731

  llvm-svn: 337891

* [X86] Use X86ISD::MUL_IMM instead of ISD::MUL for a multiply we intend to be selected to LEA. (Craig Topper, 2018-07-25; 1 file, -1/+2)

  This prevents other combines from possibly disturbing it.

  llvm-svn: 337890

* [RegisterBankInfo] Ignore InstrMappings that create impossible-to-repair operands (Tom Stellard, 2018-07-25; 1 file, -1/+1)

  Summary:
  This is a follow-up to r303043. In computeMapping(), we need to disqualify an InstrMapping if it would be impossible to repair one of the registers in the instruction to match the mapping.

  This change is needed in order to be able to define an instruction mapping for G_SELECT for the AMDGPU target, and will be tested by test/CodeGen/AMDGPU/GlobalISel/regbankselect-select.mir.

  Reviewers: ab, qcolombet, t.p.northover, dsanders

  Reviewed By: qcolombet

  Subscribers: tpr, llvm-commits

  Differential Revision: https://reviews.llvm.org/D49735

  llvm-svn: 337882

* [profile] Support profiling runtime on Fuchsia (Petr Hosek, 2018-07-25; 1 file, -0/+1)

  This ports the profiling runtime to Fuchsia and enables the instrumentation. Unlike on other platforms, Fuchsia doesn't use files to dump the instrumentation data, since on Fuchsia the filesystem may not be accessible to the instrumented process. We instead use the data sink to pass the profiling data to the system, the same way the sanitizer runtimes do.

  Differential Revision: https://reviews.llvm.org/D47208

  llvm-svn: 337881

* [x86/SLH] Teach the x86 speculative load hardening pass to harden against v1.2 BCBS attacks directly. (Chandler Carruth, 2018-07-25; 1 file, -0/+200)

  Attacks using Spectre v1.2 (a subset of BCBS) are described in the paper here: https://people.csail.mit.edu/vlk/spectre11.pdf

  The core idea is to speculatively store over the address in a vtable, jumptable, or other target of indirect control flow that will be subsequently loaded. Speculative execution after such a store can forward the stored value to subsequent loads, and if called or jumped to, the speculative execution will be steered to this potentially attacker-controlled address.

  Up until now, this could be mitigated by enabling retpolines. However, that is a relatively expensive technique for mitigating this particular flavor, especially because in most cases SLH will have already mitigated it. To fully mitigate this with SLH, we need to do two core things:

  1) Unfold loads from calls and jumps, allowing the loads to be post-load hardened.
  2) Force hardening of incoming registers even if we didn't end up needing to harden the load itself.

  We need to do these two things because hardening calls and jumps from this particular variant is importantly different from hardening against leaks of secret data. Because the "bad" data here isn't a secret, but is in fact speculatively stored by the attacker, it may be loaded from any address, regardless of whether it is read-only memory, mapped memory, or a "hardened" address. The only 100% effective way to harden these instructions is to harden their operand itself. But to the extent possible, we'd like to take advantage of all the other hardening going on; we just need a fallback in case none of that happened to cover the particular input to the control transfer instruction.

  Users of SLH currently pay a 2% to 6% performance overhead for retpolines, but this mechanism is expected to be substantially cheaper. However, it is worth reminding folks that this does not mitigate all of the things retpolines do; most notably, variant #2 is not in *any way* mitigated by this technique. So users of SLH may still want to enable retpolines, and the implementation is carefully designed to gracefully leverage retpolines to avoid the need for further hardening here when they are enabled.

  Differential Revision: https://reviews.llvm.org/D49663

  llvm-svn: 337878

* [X86] Use a shift plus an lea for multiplying by a constant that is a power of 2 plus 2/4/8. (Craig Topper, 2018-07-25; 1 file, -0/+18)

  The LEA allows us to combine an add and the multiply by 2/4/8 together, so we just need a shift for the larger power of 2.

  llvm-svn: 337875

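  A worked example under an assumed constant (68 = 64 + 4); the C below mirrors the decomposition, with the intended shape of the x86 sequence sketched in comments (register assignments are illustrative):

  ```cpp
  // x * 68 = x * (64 + 4):
  //   shl $6, %reg           ; t = x << 6 covers the large power of 2
  //   lea (%t,%x,4), %reg    ; one LEA folds both the *4 and the add
  unsigned mul68(unsigned x) { return (x << 6) + x * 4; }
  ```
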
* [X86] Expand mul by pow2 + 2 using a shift and two adds, similar to what we do for pow2 - 2. (Craig Topper, 2018-07-25; 1 file, -11/+15)

  llvm-svn: 337874

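  Again a worked example under an assumed constant (34 = 32 + 2):

  ```cpp
  // x * 34 = x * (32 + 2):
  //   t = x << 5
  //   r = t + x + x          ; two adds instead of a second multiply
  unsigned mul34(unsigned x) { return (x << 5) + x + x; }
  ```
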