summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [RISCV] Fix incorrect use of MCInstBuilderRoger Ferrer Ibanez2018-08-141-8/+6
| | | | | | | | | | | | | | | | | | This is a fix for r339314. MCInstBuilder uses the named parameter idiom and an 'operator MCInst&' to ease the creation of MCInsts. As the object of MCInstBuilder owns the MCInst is manipulating, the lifetime of the MCInst is bound to that of MCInstBuilder. In r339314 I bound a reference to the MCInst in an initializer. The temporary of MCInstBuilder (and also its MCInst) is destroyed at the end of the declaration leading to a dangling reference. Fix this by using MCInstBuilder inside an argument of a function call. Temporaries in function calls are destroyed in the enclosing full expression, so the the reference to MCInst is still valid when emitToStreamer executes. llvm-svn: 339654
* Test commit: fix punctuationChih-Mao Chen2018-08-141-1/+1
| | | | llvm-svn: 339652
* [X86] Lowering addus/subus intrinsics to native IRTomasz Krupa2018-08-143-22/+207
| | | | | | | | | | | | | | | | | | | Summary: This revision improves previous version (rL330322) which has been reverted due to crashes. This is the patch that lowers x86 intrinsics to native IR in order to enable optimizations. The patch also includes folding of previously missing saturation patterns so that IR emits the same machine instructions as the intrinsics. Reviewers: craig.topper, spatel, RKSimon Reviewed By: craig.topper Subscribers: mike.dvoretsky, DavidKreitzer, sroland, llvm-commits Differential Revision: https://reviews.llvm.org/D46179 llvm-svn: 339650
* [ARM] ParallelDSP: add option to enable/disable the passSjoerd Meijer2018-08-141-0/+6
| | | | | | Differential Revision: https://reviews.llvm.org/D50511 llvm-svn: 339645
* [ThinLTO] Fix printing of WPD remarksTeresa Johnson2018-08-141-2/+4
| | | | | | | | | | | | | | | | | Summary: When WPD is performed in a ThinLTO backend, the function may be created if it isn't already in that module. Module::getOrInsertFunction may add a bitcast, in which case the returned Constant is not a Function and doesn't have a name. Invoke stripPointerCasts() on the returned value where we access its name. Reviewers: pcc Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D49959 llvm-svn: 339640
* [ThinLTO] Handle optional args in assembly format for ConstVCallsTeresa Johnson2018-08-142-4/+12
| | | | | | | | | | | | | | | | Summary: The AsmWriter was only writing the Args for a ConstVCall if it was non-empty, however, the LLParser was always expecting it. To aid in making it optional, surround the ConstVCall VFuncId and Args in parentheses when writing, then make the Args optional when reading. Reviewers: pcc Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D49960 llvm-svn: 339637
* [BasicAA] Don't assume tail calls with byval don't alias allocasReid Kleckner2018-08-141-6/+7
| | | | | | | | | | | | | | | | | | | | Summary: Calls marked 'tail' cannot read or write allocas from the current frame because the current frame might be destroyed by the time they run. However, a tail call may use an alloca with byval. Calling with byval copies the contents of the alloca into argument registers or stack slots, so there is no lifetime issue. Tail calls never modify allocas, so we can return just ModRefInfo::Ref. Fixes PR38466, a longstanding bug. Reviewers: hfinkel, nlewycky, gbiv, george.burgess.iv Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D50679 llvm-svn: 339636
* Revert "[WebAssembly] Added default stack-only instruction mode for MC."Wouter van Oortmerssen2018-08-136-487/+256
| | | | | | This reverts commit 917a99b71ce21c975be7bfbf66f4040f965d9f3c. llvm-svn: 339630
* [Support] NFC: Allow modifying access/modification times independently in ↵Jordan Rupprecht2018-08-132-7/+14
| | | | | | | | | | | | | | | | | sys::fs::setLastModificationAndAccessTime. Summary: Add an overload to sys::fs::setLastModificationAndAccessTime that allows setting last access and modification times separately. This will allow tools to use this API when they want to preserve both the access and modification times from an input file, which may be different. Also note that both the POSIX (futimens/futimes) and Windows (SetFileTime) APIs take the two timestamps in the order of (1) access (2) modification time, so this renames the method to "setLastAccessAndModificationTime" to make it clear which timestamp is which. For existing callers, the 1-arg overload just sets both timestamps to the same thing. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50521 llvm-svn: 339628
* [AST] Minor formatting cleanup [NFC]Philip Reames2018-08-131-8/+4
| | | | llvm-svn: 339627
* [AST] Cleanup code by using MemoryLocation utility [NFC]Philip Reames2018-08-131-43/+11
| | | | | | Differential Revision: https://reviews.llvm.org/D50588 llvm-svn: 339625
* [X86] Don't ignore 0x66 prefix on relative jumps in 64-bit mode. Fix opcode ↵Craig Topper2018-08-131-39/+12
| | | | | | | | | | | | | | | | selection of relative jumps in 16-bit mode. Treat jno/jo like other jcc instructions. The behavior in 64-bit mode is different between Intel and AMD CPUs. Intel ignores the 0x66 prefix. AMD does not. objump doesn't ignore the 0x66 prefix. Since LLVM aims to match objdump behavior, we should do the same. While I was trying to fix this I had change brtarget16/32 to use ENCODING_IW/ID instead of ENCODING_Iv to get the 0x66+REX.W case to act sort of sanely. It's still wrong, but that's a problem for another day. The change in encoding exposed the fact that 16-bit mode disassembly of relative jumps was creating JMP_4 with a 2 byte immediate. It should have been JMP_2. From just printing you can't tell the difference, but if you dumped the encoding it wouldn't have matched what we started with. While fixing that, it exposed that jo/jno opcodes were missing from the switch that this patch deleted and there were no test cases for them. Fixes PR38537. llvm-svn: 339622
* [InstCombine] Re-land: Optimize redundant 'signed truncation check pattern'.Roman Lebedev2018-08-131-0/+127
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This comes with `Implicit Conversion Sanitizer - integer sign change` (D50250): ``` signed char test(unsigned int x) { return x; } ``` `clang++ -fsanitize=implicit-conversion -S -emit-llvm -o - /tmp/test.cpp -O3` * Old: {F6904292} * With this patch: {F6904294} General pattern: X & Y Where `Y` is checking that all the high bits (covered by a mask `4294967168`) are uniform, i.e. `%arg & 4294967168` can be either `4294967168` or `0` Pattern can be one of: %t = add i32 %arg, 128 %r = icmp ult i32 %t, 256 Or %t0 = shl i32 %arg, 24 %t1 = ashr i32 %t0, 24 %r = icmp eq i32 %t1, %arg Or %t0 = trunc i32 %arg to i8 %t1 = sext i8 %t0 to i32 %r = icmp eq i32 %t1, %arg This pattern is a signed truncation check. And `X` is checking that some bit in that same mask is zero. I.e. can be one of: %r = icmp sgt i32 %arg, -1 Or %t = and i32 %arg, 2147483648 %r = icmp eq i32 %t, 0 Since we are checking that all the bits in that mask are the same, and a particular bit is zero, what we are really checking is that all the masked bits are zero. So this should be transformed to: %r = icmp ult i32 %arg, 128 The transform itself ended up being rather horrible, even though i omitted some cases. Surely there is some infrastructure that can help clean this up that i missed? https://rise4fun.com/Alive/3Ou The initial commit (rL339610) was reverted, since the first assert was being triggered. The @positive_with_extra_and test now has coverage for that case. Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: RKSimon, erichkeane, vsk, llvm-commits Differential Revision: https://reviews.llvm.org/D50465 llvm-svn: 339621
* [SimplifyLibCalls] don't drop fast-math-flags on trig reflection folds ↵Sanjay Patel2018-08-131-1/+6
| | | | | | | | | | (retry r339608) Even though this code is below a function called optimizeFloatingPointLibCall(), we apparently can't guarantee that we're dealing with FPMathOperators, so bail out immediately if that's not true. llvm-svn: 339618
* Revert "[InstCombine] Optimize redundant 'signed truncation check pattern'."Roman Lebedev2018-08-131-128/+0
| | | | | | | | | At least one buildbot was able to actually trigger that assert on the top of the function. Will investigate. This reverts commit r339610. llvm-svn: 339612
* [InstCombine] Optimize redundant 'signed truncation check pattern'.Roman Lebedev2018-08-131-0/+128
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This comes with `Implicit Conversion Sanitizer - integer sign change` (D50250): ``` signed char test(unsigned int x) { return x; } ``` `clang++ -fsanitize=implicit-conversion -S -emit-llvm -o - /tmp/test.cpp -O3` * Old: {F6904292} * With this patch: {F6904294} General pattern: X & Y Where `Y` is checking that all the high bits (covered by a mask `4294967168`) are uniform, i.e. `%arg & 4294967168` can be either `4294967168` or `0` Pattern can be one of: %t = add i32 %arg, 128 %r = icmp ult i32 %t, 256 Or %t0 = shl i32 %arg, 24 %t1 = ashr i32 %t0, 24 %r = icmp eq i32 %t1, %arg Or %t0 = trunc i32 %arg to i8 %t1 = sext i8 %t0 to i32 %r = icmp eq i32 %t1, %arg This pattern is a signed truncation check. And `X` is checking that some bit in that same mask is zero. I.e. can be one of: %r = icmp sgt i32 %arg, -1 Or %t = and i32 %arg, 2147483648 %r = icmp eq i32 %t, 0 Since we are checking that all the bits in that mask are the same, and a particular bit is zero, what we are really checking is that all the masked bits are zero. So this should be transformed to: %r = icmp ult i32 %arg, 128 https://rise4fun.com/Alive/3Ou Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: RKSimon, erichkeane, vsk, llvm-commits Differential Revision: https://reviews.llvm.org/D50465 llvm-svn: 339610
* revert r339608 - [SimplifyLibCalls] don't drop fast-math-flags on trig ↵Sanjay Patel2018-08-131-3/+1
| | | | | | | | | reflection folds Can't set the builder flags without knowing this is an FPMathOperator. I'll add a test for that and try again. llvm-svn: 339609
* [SimplifyLibCalls] don't drop fast-math-flags on trig reflection foldsSanjay Patel2018-08-131-1/+3
| | | | llvm-svn: 339608
* [SimplifyLibCalls] add reflection fold for -sin(-x) (PR38458)Sanjay Patel2018-08-131-15/+27
| | | | | | | | | | This is a very partial fix for the reported problem. I suspect we do not get this fold in most motivating cases because most of the time, the libcall would have been replaced by an intrinsic, and that optimization is handled elsewhere...but maybe it should be handled here? llvm-svn: 339604
* [CodeGen] Fix assert in SelectionDAG::computeKnownBitsScott Linder2018-08-131-2/+2
| | | | | | | | | | Fix SelectionDAG::computeKnownBits asserting when handling EXTRACT_SUBVECTOR when zero extending the demanded elements mask if it is already as long as the source vector. Differential Revision: https://reviews.llvm.org/D49574 llvm-svn: 339600
* [X86][BtVer2] Use NoSchedPredicate to model default transitions in variant ↵Andrea Di Biagio2018-08-131-8/+8
| | | | | | scheduling classes. NFC. llvm-svn: 339589
* [SimplifyLibCalls] reduce code for optimizeCos; NFCISanjay Patel2018-08-131-9/+8
| | | | llvm-svn: 339588
* [InstCombine] Limit simplifyAllocaArraySize constant folding to values that ↵Simon Pilgrim2018-08-131-24/+26
| | | | | | | | fit into a uint64_t Fixes OSS-Fuzz: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=5223 llvm-svn: 339584
* [itanium demangler] Add llvm::itaniumFindTypesInMangledName()Erik Pilkington2018-08-131-1/+15
| | | | | | | | | | | | This function calls a callback whenever a <type> is parsed. This is necessary to implement FindAlternateFunctionManglings in LLDB, which uses a similar hack in FastDemangle. Once that function has been updated to use this version, FastDemangle can finally be removed. Differential revision: https://reviews.llvm.org/D50586 llvm-svn: 339580
* [SLC] Expand simplification of pow() for vector typesEvandro Menezes2018-08-131-40/+37
| | | | | | | | Also consider vector constants when simplifying `pow()`. Differential revision: https://reviews.llvm.org/D50035 llvm-svn: 339578
* [Hexagon] Silence -Wuninitialized warning from GCC 5.4, NFCKrzysztof Parzyszek2018-08-131-0/+4
| | | | | | | | Patch by Kim Gräsman. Differential Revision: https://reviews.llvm.org/D50623 llvm-svn: 339576
* Revert "[Sparc] Add support for the cycle counter available in GR740"Daniel Cederman2018-08-135-21/+2
| | | | | | | It breaks when using EXPENSIVE_CHECKS with the error message "Bad machine code: Using an undefined physical register". llvm-svn: 339570
* Check for tied operandsSid Manning2018-08-131-0/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D50592 llvm-svn: 339567
* [SystemZ] Increase the amount of inlining.Jonas Paulsson2018-08-131-0/+2
| | | | | | | Implement getInliningThresholdMultiplier() and have it return 3. Review: Ulrich Weigand llvm-svn: 339563
* [DAGCombiner] simplifyDivRem - add comment describing divide by undef/zero ↵Simon Pilgrim2018-08-131-0/+5
| | | | | | combine. NFC. llvm-svn: 339561
* [CGP] Fix GEP issue with out of range APInt constant values not fitting in ↵Simon Pilgrim2018-08-131-2/+7
| | | | | | | | int64_t Test case reduced from https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=7173 llvm-svn: 339556
* [Sparc] Add support for the cycle counter available in GR740Daniel Cederman2018-08-135-2/+21
| | | | | | | | | | | | | | | | | | Summary: The GR740 provides an up cycle counter in the registers ASR22 and ASR23. As these registers can not be read together atomically we only use the value of ASR23 for llvm.readcyclecounter(). The ASR23 register holds the 32 LSBs of the up-counter. Reviewers: jyknight, venkatra Reviewed By: jyknight Subscribers: fedor.sergeev, jrtc27, llvm-commits Differential Revision: https://reviews.llvm.org/D48638 llvm-svn: 339551
* [ARM] Added FP16 VREV Vector Instrinsic CodeGen supportLuke Geeson2018-08-131-0/+2
| | | | llvm-svn: 339546
* [GuardWidening] Widen very likely non-taken br instructionsMax Kazantsev2018-08-131-24/+47
| | | | | | | | | | This is a second part of D49974 that handles widening of conditional branches that have very likely `false` branch. Differential Revision: https://reviews.llvm.org/D50040 Reviewed By: reames llvm-svn: 339537
* [SelectionDAG] In PromoteFloatOp_BITCAST, insert a bitcast after the ↵Craig Topper2018-08-131-8/+11
| | | | | | | | fp_to_fp16 in case the result type isn't a scalar integer. This is another variation of PR38533. In this case, the result type of the bitcast is legal and 16-bits wide, but not a scalar integer. So we need to emit the convert to i16 and then bitcast it to the true result type. This new bitcast will be further type legalized if necessary. llvm-svn: 339536
* [SelectionDAG] In PromoteIntRes_BITCAST, when the input is TypePromoteFloat, ↵Craig Topper2018-08-131-2/+2
| | | | | | | | | | make sure the output type is scalar. For vectors, use a store and load of temporary. Previously if the result type was a vector, we emitted a FP_TO_FP16 with a vector result type which isn't valid. This is basically the opposite case of the root cause of PR38533. llvm-svn: 339535
* Restore correct x86_64 EH encodings in kernel code modelLei Liu2018-08-131-9/+14
| | | | | | | | | | | | | Fixes PR37524. The exception handling encodings for x86_64 in kernel code model has been changed with r309884. Restore it to correct ones. These encodings include PersonalityEncoding, LSDAEncoding and TTypeEncoding. Differential Revision: https://reviews.llvm.org/D50490 llvm-svn: 339534
* [SelectionDAG] In PromoteFloatRes_BITCAST, insert a bitcast before the ↵Craig Topper2018-08-131-2/+4
| | | | | | | | | | fp16_to_fp in case the input type isn't an i16. The bitcast can be further legalized as needed. Fixes PR38533. llvm-svn: 339533
* [InstCombine] Fix typo in comment. NFCCraig Topper2018-08-131-1/+1
| | | | llvm-svn: 339532
* [InstCombine] Replace call to haveNoCommonBitsSet in visitXor with just the ↵Craig Topper2018-08-131-2/+8
| | | | | | | | | | | | | | | | special case that doesn't use computeKnownBits. Summary: computeKnownBits is expensive. The cases that would be detected by the computeKnownBits portion of haveNoCommonBitsSet were already handled by the earlier call to SimplifyDemandedInstructionBits. Reviewers: spatel, lebedev.ri Reviewed By: lebedev.ri Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50604 llvm-svn: 339531
* [X86] Add constant folding for AVX512 versions of scalar floating point to ↵Craig Topper2018-08-121-5/+76
| | | | | | | | | | | | | | | | | | | integer conversion intrinsics. Summary: We've supported constant folding for sse versions for many years. This patch adds support for the avx512 versions including unsigned with the default rounding mode. We could probably do more with other roundings modes and SAE in the future. The test cases are largely based on the sse.ll test cases. But I did add some test cases to ensure the unsigned versions don't accept negative values. Also checked the bounds of f64->i32 conversions to make sure unsigned has a larger positive range than signed. Reviewers: RKSimon, spatel, chandlerc Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50553 llvm-svn: 339529
* DAG: Check no-signed-zeros instead of unsafe-fp-mathMatt Arsenault2018-08-121-3/+3
| | | | | | | Addresses fixme, although this should still be checking individual operand flags. llvm-svn: 339525
* [InstCombine] Fold Select with binary op - non-commutative opcodesDavid Bolvansky2018-08-121-2/+3
| | | | | | | | | | | | | | | | | | | Summary: Basic version was merged - https://reviews.llvm.org/D49954 This adds support for FP & non-commutative opcodes Precommited tests: https://reviews.llvm.org/rL338727 Reviewers: spatel, lebedev.ri Reviewed By: spatel Subscribers: jfb Differential Revision: https://reviews.llvm.org/D50190 llvm-svn: 339520
* [InstCombine] fix/enhance fadd/fsub factorizationSanjay Patel2018-08-121-87/+45
| | | | | | | | | | | | | (X * Z) + (Y * Z) --> (X + Y) * Z (X * Z) - (Y * Z) --> (X - Y) * Z (X / Z) + (Y / Z) --> (X + Y) / Z (X / Z) - (Y / Z) --> (X - Y) / Z The existing code that implemented these folds failed to optimize vectors, and it transformed code with multiple uses when it should not have. llvm-svn: 339519
* [InstSimplify] Guard against large shift amounts.Benjamin Kramer2018-08-121-3/+3
| | | | | | | These are always UB, but can happen for large integer inputs. Testing it is very fragile as -simplifycfg will nuke the UB top-down. llvm-svn: 339515
* AMDGPU: Check NSZ MI flag when folding omodMatt Arsenault2018-08-121-4/+6
| | | | | | | | I'm not sure the exact nsz flag combination that is OK. I think as long as it's on either, this is OK. For now just check it on the omod multiply. llvm-svn: 339513
* AMDGPU: Use splat vectors for undefs when folding canonicalizeMatt Arsenault2018-08-121-5/+20
| | | | | | | | | | | If one of the elements is undef, use the canonicalized constant from the other element instead of 0. Splat vectors are more useful for other optimizations, such as matching vector clamps. This was breaking on clamps of half3 from the undef 4th component. llvm-svn: 339512
* AMDGPU: Fix packing undef parts of build_vectorMatt Arsenault2018-08-122-6/+34
| | | | llvm-svn: 339511
* [TargetLowering] Simplify one of the special cases in SimplifyDemandedBits ↵Craig Topper2018-08-121-21/+21
| | | | | | | | for XOR. NFCI We were checking for all bits being Known by checking Known.Zero|Known.One, but if all the bits are known then the value should be a Constant and we can just check for that instead. llvm-svn: 339509
* [TargetLowering] Use APInt::isSubsetOf to simplify some code. NFCCraig Topper2018-08-121-1/+1
| | | | llvm-svn: 339508
OpenPOWER on IntegriCloud