summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [SelectionDAG] In PromoteFloatOp_BITCAST, insert a bitcast after the ↵Craig Topper2018-08-131-8/+11
| | | | | | | | fp_to_fp16 in case the result type isn't a scalar integer. This is another variation of PR38533. In this case, the result type of the bitcast is legal and 16-bits wide, but not a scalar integer. So we need to emit the convert to i16 and then bitcast it to the true result type. This new bitcast will be further type legalized if necessary. llvm-svn: 339536
* [SelectionDAG] In PromoteIntRes_BITCAST, when the input is TypePromoteFloat, ↵Craig Topper2018-08-131-2/+2
| | | | | | | | | | make sure the output type is scalar. For vectors, use a store and load of temporary. Previously if the result type was a vector, we emitted a FP_TO_FP16 with a vector result type which isn't valid. This is basically the opposite case of the root cause of PR38533. llvm-svn: 339535
* Restore correct x86_64 EH encodings in kernel code modelLei Liu2018-08-131-9/+14
| | | | | | | | | | | | | Fixes PR37524. The exception handling encodings for x86_64 in kernel code model has been changed with r309884. Restore it to correct ones. These encodings include PersonalityEncoding, LSDAEncoding and TTypeEncoding. Differential Revision: https://reviews.llvm.org/D50490 llvm-svn: 339534
* [SelectionDAG] In PromoteFloatRes_BITCAST, insert a bitcast before the ↵Craig Topper2018-08-131-2/+4
| | | | | | | | | | fp16_to_fp in case the input type isn't an i16. The bitcast can be further legalized as needed. Fixes PR38533. llvm-svn: 339533
* [InstCombine] Fix typo in comment. NFCCraig Topper2018-08-131-1/+1
| | | | llvm-svn: 339532
* [InstCombine] Replace call to haveNoCommonBitsSet in visitXor with just the ↵Craig Topper2018-08-131-2/+8
| | | | | | | | | | | | | | | | special case that doesn't use computeKnownBits. Summary: computeKnownBits is expensive. The cases that would be detected by the computeKnownBits portion of haveNoCommonBitsSet were already handled by the earlier call to SimplifyDemandedInstructionBits. Reviewers: spatel, lebedev.ri Reviewed By: lebedev.ri Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50604 llvm-svn: 339531
* [X86] Add constant folding for AVX512 versions of scalar floating point to ↵Craig Topper2018-08-121-5/+76
| | | | | | | | | | | | | | | | | | | integer conversion intrinsics. Summary: We've supported constant folding for sse versions for many years. This patch adds support for the avx512 versions including unsigned with the default rounding mode. We could probably do more with other roundings modes and SAE in the future. The test cases are largely based on the sse.ll test cases. But I did add some test cases to ensure the unsigned versions don't accept negative values. Also checked the bounds of f64->i32 conversions to make sure unsigned has a larger positive range than signed. Reviewers: RKSimon, spatel, chandlerc Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50553 llvm-svn: 339529
* DAG: Check no-signed-zeros instead of unsafe-fp-mathMatt Arsenault2018-08-121-3/+3
| | | | | | | Addresses fixme, although this should still be checking individual operand flags. llvm-svn: 339525
* [InstCombine] Fold Select with binary op - non-commutative opcodesDavid Bolvansky2018-08-121-2/+3
| | | | | | | | | | | | | | | | | | | Summary: Basic version was merged - https://reviews.llvm.org/D49954 This adds support for FP & non-commutative opcodes Precommited tests: https://reviews.llvm.org/rL338727 Reviewers: spatel, lebedev.ri Reviewed By: spatel Subscribers: jfb Differential Revision: https://reviews.llvm.org/D50190 llvm-svn: 339520
* [InstCombine] fix/enhance fadd/fsub factorizationSanjay Patel2018-08-121-87/+45
| | | | | | | | | | | | | (X * Z) + (Y * Z) --> (X + Y) * Z (X * Z) - (Y * Z) --> (X - Y) * Z (X / Z) + (Y / Z) --> (X + Y) / Z (X / Z) - (Y / Z) --> (X - Y) / Z The existing code that implemented these folds failed to optimize vectors, and it transformed code with multiple uses when it should not have. llvm-svn: 339519
* [InstSimplify] Guard against large shift amounts.Benjamin Kramer2018-08-121-3/+3
| | | | | | | These are always UB, but can happen for large integer inputs. Testing it is very fragile as -simplifycfg will nuke the UB top-down. llvm-svn: 339515
* AMDGPU: Check NSZ MI flag when folding omodMatt Arsenault2018-08-121-4/+6
| | | | | | | | I'm not sure the exact nsz flag combination that is OK. I think as long as it's on either, this is OK. For now just check it on the omod multiply. llvm-svn: 339513
* AMDGPU: Use splat vectors for undefs when folding canonicalizeMatt Arsenault2018-08-121-5/+20
| | | | | | | | | | | If one of the elements is undef, use the canonicalized constant from the other element instead of 0. Splat vectors are more useful for other optimizations, such as matching vector clamps. This was breaking on clamps of half3 from the undef 4th component. llvm-svn: 339512
* AMDGPU: Fix packing undef parts of build_vectorMatt Arsenault2018-08-122-6/+34
| | | | llvm-svn: 339511
* [TargetLowering] Simplify one of the special cases in SimplifyDemandedBits ↵Craig Topper2018-08-121-21/+21
| | | | | | | | for XOR. NFCI We were checking for all bits being Known by checking Known.Zero|Known.One, but if all the bits are known then the value should be a Constant and we can just check for that instead. llvm-svn: 339509
* [TargetLowering] Use APInt::isSubsetOf to simplify some code. NFCCraig Topper2018-08-121-1/+1
| | | | llvm-svn: 339508
* [X86] Remove unnecessary AddedComplexity line. NFCCraig Topper2018-08-121-1/+1
| | | | | | The use of the or_is_add predicate already gives enough of a complexity boost to get the patterns ordered properly. llvm-svn: 339507
* [Dominators] Remove the DeferredDominance classChijun Sima2018-08-111-190/+0
| | | | | | | | | | | | | | Summary: After converting all existing passes to use the new DomTreeUpdater interface, there isn't any usage of the original DeferredDominance class. Thus, we can safely remove it from the codebase. Reviewers: kuhar, brzycki, dmgreen, davide, grosser Reviewed By: kuhar, brzycki Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D49747 llvm-svn: 339502
* [UnJ] Improve explicit loop count checksDavid Green2018-08-111-52/+67
| | | | | | | | | | | | | Try to improve the computed counts when it has been explicitly set by a pragma or command line option. This moves the code around, so that first call to computeUnrollCount to get a sensible count and override that if explicit unroll and jam counts are specified. Also added some extra debug messages for when unroll and jamming is disabled. Differential Revision: https://reviews.llvm.org/D50075 llvm-svn: 339501
* [UnJ] Create a hasInvariantIterationCount function. NFCDavid Green2018-08-112-14/+23
| | | | | | | | | | Pulled out a separate function for some code that calculates if an inner loop iteration count is invariant to it's outer loop. Differential Revision: https://reviews.llvm.org/D50063 llvm-svn: 339500
* [X86] Remove the AL/AX/EAX/RAX short immediate forms from the macro fusion ↵Craig Topper2018-08-111-18/+0
| | | | | | | | shouldScheduleAdjacent. NFC These instructions are only created by the backend during MCInst lowering. llvm-svn: 339499
* [X86] Add the mem-reg form of CMP to the macro fusion shouldScheduleAdjacent.Craig Topper2018-08-111-0/+4
| | | | | | Unlike the other arithmetic instructions the mem-reg form of compare is just a load and not a RMW operation. According to the Intel optimization manual, this form is also supported by macro fusion. llvm-svn: 339498
* [X86] Remove ADD8mi and ADDmr from the macro fusion shouldScheduleAdjacent.Craig Topper2018-08-111-2/+0
| | | | | | The are RMW of memory operations. They aren't eligible for macro fusion. llvm-svn: 339497
* [X86] Change the MOV32ri64 pseudo instruction to def a GR64 directly instead ↵Craig Topper2018-08-112-13/+10
| | | | | | | | | | of wrapping it in a SUBREG_TO_REG. Now we switch to the subregister in expandPostRAPseudos where we already switched the opcode. This simplifies a few isel patterns that used the pseudo directly. And magically seems to have improved our ability to CSE it in the undef-label.ll test. llvm-svn: 339496
* Fix WebAssembly instruction printer after r339474Richard Trieu2018-08-111-1/+5
| | | | | | | | Treat the stack variants of control instructions the same as regular instructions. Otherwise, the vector ControlFlowStack will be the wrong size and have out-of-bounds access. This was detected by MemorySanitizer. llvm-svn: 339495
* AMDGPU/GlobalISel: Define instruction mapping for G_INSERTTom Stellard2018-08-111-0/+14
| | | | | | | | | | | | Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D49625 llvm-svn: 339491
* Re-commit "[NFC] More ConstantMerge refactoring"JF Bastien2018-08-101-18/+23
| | | | | | | My previous change moved some code upwards which caused an assert in debug mode because the global value didn't necessarily have an initializer. Don't do that. llvm-svn: 339485
* [LICM] Hoist assumes out of loopsPhilip Reames2018-08-101-0/+9
| | | | | | | | If we have an assume which is known to execute and whose operand is invariant, we can lift that into the pre-header. So long as we don't change which paths the assume executes on, this is a legal transformation. It's likely to be a useful canonicalization as other transforms only look for dominating assumes. Differential Revision: https://reviews.llvm.org/D50364 llvm-svn: 339481
* Revert "[NFC] More ConstantMerge refactoring"JF Bastien2018-08-101-25/+22
| | | | | | Sanitizers seem unhappy. llvm-svn: 339480
* Fix unused lambda capture warning from r339472.Eli Friedman2018-08-101-1/+1
| | | | llvm-svn: 339479
* [NFC] More ConstantMerge refactoringJF Bastien2018-08-101-22/+25
| | | | | | This makes my upcoming patch much easier to read. llvm-svn: 339478
* [WebAssembly] Added default stack-only instruction mode for MC.Wouter van Oortmerssen2018-08-106-256/+487
| | | | | | | | | | | | | | | | | | | | | | | Summary: Moved Explicit Locals pass to last. Made that pass obligatory. Made it convert from register to stack based instructions, and removed the registers. Fixes to related code that was expecting register based instructions. Added the correct testing flag to all tests, depending on what the format they were expecting so far. Translated one test to stack format as example: reg-stackify-stack.ll tested: llvm-lit -v `find test -name WebAssembly` unittests/MC/* Reviewers: dschuff, sunfish Subscribers: jfb, llvm-commits, aheejin, eraman, jgravelle-google, sbc100 Differential Revision: https://reviews.llvm.org/D50568 llvm-svn: 339474
* [ARM] Adjust AND immediates to make them cheaper to select.Eli Friedman2018-08-103-0/+85
| | | | | | | | | | | | | | | | | | | | | | | LLVM normally prefers to minimize the number of bits set in an AND immediate, but that doesn't always match the available ARM instructions. In Thumb1 mode, prefer uxtb or uxth where possible; otherwise, prefer a two-instruction sequence movs+ands or movs+bics. Some potential improvements outlined in ARMTargetLowering::targetShrinkDemandedConstant, but seems to work pretty well already. The ARMISelDAGToDAG fix ensures we don't generate an invalid UBFX instruction due to a larger-than-expected mask. (It's orthogonal, in some sense, but as far as I can tell it's either impossible or nearly impossible to reproduce the bug without this change.) According to my testing, this seems to consistently improve codesize by a small amount by forming bic more often for ISD::AND with an immediate. Differential Revision: https://reviews.llvm.org/D50030 llvm-svn: 339472
* [MS Demangler] Support extern "C" functions.Zachary Turner2018-08-101-24/+48
| | | | | | | | | | | | | | There are two cases we need to support with extern "C" functions. The first is the case of a '9' indicating that the function has no prototype. This occurs when we mangle a symbol inside of an extern "C" function, but not the function itself. The second case is when we have an overloaded extern "C" functions. In this case we emit $$J0 to indicate this. This patch adds support for both of these cases. llvm-svn: 339471
* [InstCombine] rearrange code for foldSelectBinOpIdentity; NFCISanjay Patel2018-08-101-21/+25
| | | | | | | | | | | | | This is a retry of rL339439 with a fix for the problem that caused the original commit to be reverted at rL339446. That problem was that the compare can be integer while the binop is FP or vice-versa, so we need to use the binop type when we ask for the identity constant. A test to guard against the problem was added at rL339453. llvm-svn: 339469
* [MS Demangler] Demangle cv qualifiers on template args.Zachary Turner2018-08-101-0/+4
| | | | | | | | | | | | | Before we wouldn't properly demangle something like Foo<const int>. Template args have a special escape sequence '$$C' that is optional, but if it is present contains qualifiers. So we need to check for this and only if it present, demangle qualifiers before demangling the type. With this fix, we re-enable some tests that were previously marked FIXME. llvm-svn: 339465
* AMDGPU: More canonicalized operationsMatt Arsenault2018-08-102-1/+18
| | | | llvm-svn: 339464
* AMDGPU: Combine and of seto/setuo and fp_classMatt Arsenault2018-08-101-0/+23
| | | | | | Clear the nan (or non-nan) test bits from the mask. llvm-svn: 339462
* AMDGPU: Turn class x, p_zero|n_zero into fcmp oeq x, 0Matt Arsenault2018-08-101-0/+9
| | | | | | The library does use this for some reason. llvm-svn: 339461
* AMDGPU: Match isfinite pattern to class instructionsMatt Arsenault2018-08-101-3/+13
| | | | llvm-svn: 339460
* AMDGPU: Add LLVM_FALLTHROUGHMatt Arsenault2018-08-101-0/+2
| | | | llvm-svn: 339458
* [hwasan] Add -hwasan-with-ifunc flag.Evgeniy Stepanov2018-08-101-6/+19
| | | | | | | | | | | | Summary: Similar to asan's flag, it can be used to disable the use of ifunc to access hwasan shadow address. Reviewers: vitalybuka, kcc Subscribers: srhines, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D50544 llvm-svn: 339447
* [InstCombine] revert r339439 - rearrange code for foldSelectBinOpIdentitySanjay Patel2018-08-101-25/+21
| | | | | | | That was supposed to be NFC, but it exposed a logic hole somewhere that caused bots to fail. llvm-svn: 339446
* [InstCombine] rearrange code for foldSelectBinOpIdentity; NFCISanjay Patel2018-08-101-21/+25
| | | | | | | This should make it easier to folow and to add the planned enhancements such as D50190. llvm-svn: 339439
* [MS Demangler] Properly demangle conversion operators.Zachary Turner2018-08-101-20/+44
| | | | | | | These were completely broken before. We need to handle the 'B' operator tag. llvm-svn: 339436
* [MS Demangler] Fix several issues related to templates.Zachary Turner2018-08-101-34/+90
| | | | | | | | | | | | These were uncovered when porting the mangling tests in ms-templates.cpp from clang/CodeGenCXX over to demangling tests. The main issues fixed here are surrounding integer literal signed and unsignedness, empty array dimensions, and pointer and reference non-type template parameters. Differential Revision: https://reviews.llvm.org/D50512 llvm-svn: 339434
* [ARM] Disallow zexts in ARMCodeGenPrepareSam Parker2018-08-101-165/+109
| | | | | | | | | | | | | | | | | | | Enabling ARMCodeGenPrepare by default caused a whole load of failures. This is due to zexts and truncs not being handled properly. ZExts are messy so it's just easier to disable for now and truncs are allowed only as 'sinks'. I still need to figure out why allowing them as 'sources' causes so many failures. The other main changes are that we are explicit in the types that we converting to, it's now always 'TypeSize'. Type support is also now performed while checking for valid opcodes as it unnecessarily complicated having the checks are different stages. I've moved the tests around too, so we have the zext and truncs in their own file as well as the overflowing opcode tests. Differential Revision: https://reviews.llvm.org/D50518 llvm-svn: 339432
* [X86][SSE] Pull out repeated shift getOpcode() calls. NFCI.Simon Pilgrim2018-08-101-23/+23
| | | | llvm-svn: 339425
* Fix -Wimplicit-fallthrough warning introduced in rL339397.Simon Pilgrim2018-08-101-0/+1
| | | | llvm-svn: 339422
* Rename the cfguard module flag to cfguardtableHans Wennborg2018-08-101-1/+1
| | | | | | | | | | The previous name sounds like it inserts cfguard implementation, but it really just emits the table of address-taken functions. Change the name to better reflect that. Clang will be updated in the next commit. llvm-svn: 339419
OpenPOWER on IntegriCloud