Commit message  Author  Age  Files  Lines
...
* [SelectionDAG][ARM][X86] Teach PromoteIntRes_SETCC to do a better job picking the result type for the setcc.  Craig Topper  2018-03-15  5  -341/+434
  Previously if getSetccResultType returned an illegal type we just fell back to using the default promoted type. This appears to have been to handle the case where for vectors getSetccResultType returns the input type, but the input type itself isn't legal and will need to be promoted. Without the legality check we would never reach a legal type.

  But just picking the promoted type to be the setcc type can create strange setccs where the result type is 128 bits and the operand type is 256 bits, for example if the result type was promoted to v8i16 from v8i1 but the input type was promoted from v8i23 to v8i32. We currently handle this with custom lowering code in X86.

  This legality check also caused us to reject the getSetccResultType result when the input type needed to be widened or split, even though that result wouldn't have caused legalization to get stuck.

  This patch tries to fix this by detecting when the type returned by getSetccResultType needs to be promoted. If its input type also needs to be promoted, we ask for a new setcc result type based on its eventual promoted value. Otherwise we fall back to the default type to promote to. For any other illegal values we might get back from the initial call to getSetccResultType, we just keep the type and allow it to be re-legalized later via splitting, widening, or scalarizing.

  llvm-svn: 327683
* [X86][Btver2] Fix ymm div/sqrt to use fmul unitSimon Pilgrim2018-03-152-39/+38
| | | | | | | | YMM FDiv/FSqrt are dispatched on pipe JFPU1 but should be performed on the JFPM unit - that is where most of the cycles are spent. This matches the pipes for WriteFSqrt/WriteFDiv definitions. llvm-svn: 327682
* Use standard `print(dbgs())` pattern to implement DebugLoc::dumpSean Silva2018-03-151-13/+1
| | | | | | The open-coded implementation had a bug. It didn't print filenames. llvm-svn: 327681
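The pattern named in the title can be sketched as follows: `dump()` only forwards to `print()`, so the two cannot drift apart. This is a minimal illustration, not LLVM's code. `LocSketch` and `toString` are hypothetical names, and `std::ostream`/`std::cerr` stand in for `raw_ostream` and `dbgs()`.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a source location; not LLVM's DebugLoc.
struct LocSketch {
  std::string File;
  unsigned Line;
  unsigned Col;

  // Single implementation of the textual form.
  void print(std::ostream &OS) const {
    OS << File << ':' << Line << ':' << Col;
  }

  // dump() only forwards to print(), so it cannot silently omit a field
  // such as the filename. std::cerr stands in for dbgs() here.
  void dump() const {
    print(std::cerr);
    std::cerr << '\n';
  }
};

// Hypothetical helper that captures what print() emits.
inline std::string toString(const LocSketch &L) {
  std::ostringstream OS;
  L.print(OS);
  return OS.str();
}
```

Because every consumer goes through `print()`, a fix to the output format reaches `dump()` automatically.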
* [InstCombine] add tests for fcmp+select -> fabs; NFCSanjay Patel2018-03-151-24/+57
| | | | llvm-svn: 327680
* Fix PDB injected sources test.Zachary Turner2018-03-152-5/+15
  This test was originally disabled because it was failing on a bot. It turns out I had run dos2unix on the file, and that removed a necessary byte from the file. I'm just recommitting the proper file and updating the test to test a little bit more now.

  llvm-svn: 327679
* MSan, FreeBSD few tests fixesVitaly Buka2018-03-152-2/+13
  Summary:
  pthread_getattr_np_deadlock support
  pthread_getname_np unsupported

  Reviewers: krytarowski, vitalybuka
  Reviewed By: vitalybuka
  Subscribers: eugenis, srhines, krytarowski, llvm-commits, #sanitizers
  Differential Revision: https://reviews.llvm.org/D44085

  llvm-svn: 327678
* OpenBSD UBsan support procmapsVitaly Buka2018-03-154-9/+144
  Summary: procmaps OpenBSD specifics

  Patch by David CARLIER

  Reviewers: krytarowski, vitalybuka
  Reviewed By: vitalybuka
  Subscribers: mgorny, emaste, kubamracek, fedor.sergeev, llvm-commits, #sanitizers
  Differential Revision: https://reviews.llvm.org/D44050

  llvm-svn: 327677
* [X86][Btver2] Add test to show timeline of fpu instructions on different pipes/units  Simon Pilgrim  2018-03-15  1  -0/+114
  Try to demonstrate the scheduling from fpu0/fpu1 pipes to the valu0/vimul/fpa or valu1/stc/fpm functional units

  llvm-svn: 327676
* [PDB] Fix a bug where we were serializing hash tables incorrectly.Zachary Turner2018-03-152-7/+16
| | | | | | | | | | | There was some code that tried to calculate the number of 4-byte words required to hold N bits, but it was instead computing the number of bytes required to hold N bits. This was leading to extraneous data being output into the hash table, which would cause certain operations in DIA (the Microsoft PDB reader) to fail. llvm-svn: 327675
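The difference between the two computations is easy to see side by side. The helpers below are a sketch with hypothetical names, not the code from the patch, and they ignore overflow for bit counts near the top of the 32-bit range.

```cpp
#include <cstdint>

// Bytes needed to hold NumBits bits: the quantity the buggy code computed.
constexpr std::uint32_t bytesForBits(std::uint32_t NumBits) {
  return (NumBits + 7) / 8;
}

// 4-byte words needed to hold NumBits bits: the quantity that was intended.
constexpr std::uint32_t wordsForBits(std::uint32_t NumBits) {
  return (NumBits + 31) / 32;
}
```

For 33 bits the word count is 2 but the byte count is 5, so using the byte count as a word count writes three extra words.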
* OpenBSD UBsan support common functionsVitaly Buka2018-03-152-69/+107
  Summary: Split the OpenBSD-specific code out of the common Linux implementation

  Patch by David Carlier

  Reviewers: krytarowski, vitalybuka
  Reviewed By: vitalybuka
  Subscribers: emaste, srhines, kubamracek, fedor.sergeev, llvm-commits, #sanitizers
  Differential Revision: https://reviews.llvm.org/D44036

  llvm-svn: 327674
* [WebAssembly] Add DebugLoc information to WebAssembly block and loop.Derek Schuff2018-03-154-8/+161
| | | | | | | Patch by Yury Delendik Differential Revision: https://reviews.llvm.org/D44448 llvm-svn: 327673
* [NVPTX] TblGen-ized lowering of WMMA intrinsics.Artem Belevich2018-03-155-620/+155
| | | | | | | | NFC. Differential Revision: https://reviews.llvm.org/D43151 llvm-svn: 327672
* [LoopUnroll] Peel off iterations if it makes conditions true/false.Florian Hahn2018-03-155-7/+705
| | | | | | | | | | | | | | | If the loop body contains conditions of the form IndVar < #constant, we can remove the checks by peeling off #constant iterations. This improves codegen for PR34364. Reviewers: mkuper, mkazantsev, efriedma Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D43876 llvm-svn: 327671
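At the source level, the effect of peeling can be sketched as below. This is a hand-written illustration with hypothetical function names and a constant of 2; the pass itself works on LLVM IR.

```cpp
#include <cstddef>
#include <vector>

// Original form: the condition `I < 2` is evaluated on every iteration.
inline long sumOriginal(const std::vector<int> &V) {
  long Sum = 0;
  for (std::size_t I = 0; I < V.size(); ++I) {
    if (I < 2)
      Sum += 10 * V[I];
    else
      Sum += V[I];
  }
  return Sum;
}

// Peeled form: the first two iterations are pulled out of the loop, so
// inside the remaining loop `I < 2` is always false and the check is gone.
inline long sumPeeled(const std::vector<int> &V) {
  long Sum = 0;
  std::size_t I = 0;
  if (I < V.size()) { Sum += 10 * V[I]; ++I; } // peeled iteration 0
  if (I < V.size()) { Sum += 10 * V[I]; ++I; } // peeled iteration 1
  for (; I < V.size(); ++I)
    Sum += V[I];
  return Sum;
}
```

Both functions return the same value for any input, including vectors shorter than the peel count.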
* Re-land r327620 "[CodeView] Initial support for emitting S_BLOCK32 symbols for lexical scopes"  Reid Kleckner  2018-03-15  5  -14/+514
  This is safe to land now that we don't copy FunctionInfo when rehashing the DenseMap.

  llvm-svn: 327670
* [codeview] Fix sense of the assertion about hashtable insertionReid Kleckner2018-03-151-1/+1
| | | | llvm-svn: 327669
* COFF: Implement string tail merging.Peter Collingbourne2018-03-157-5/+184
| | | | | | | | | | | | | | | | | | In COFF, duplicate string literals are merged by placing them in a comdat whose leader symbol name contains a specific prefix followed by the hash and partial contents of the string literal. This gives us an easy way to identify sections containing string literals in the linker: check for leader symbol names with the given prefix. Any sections that are identified in this way as containing string literals may be tail merged. We do so using the StringTableBuilder class, which is also used to tail merge string literals in the ELF linker. Tail merging is enabled only if ICF is enabled, as this provides a signal as to whether the user cares about binary size. Differential Revision: https://reviews.llvm.org/D44504 llvm-svn: 327668
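To illustrate what tail merging buys, here is a deliberately naive quadratic sketch: a string that is a suffix of an already placed string reuses that string's bytes instead of being stored again. It is not the StringTableBuilder algorithm, and `MergedTable` and `tailMerge` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct MergedTable {
  std::string Data;                       // NUL-terminated strings, concatenated
  std::map<std::string, std::size_t> Off; // string -> offset into Data
};

inline MergedTable tailMerge(std::vector<std::string> Strs) {
  // Process longest first so that any suffix is seen after its host string.
  std::sort(Strs.begin(), Strs.end(),
            [](const std::string &A, const std::string &B) {
              return A.size() > B.size();
            });
  MergedTable T;
  std::vector<std::pair<std::string, std::size_t>> Placed;
  for (const std::string &S : Strs) {
    if (T.Off.count(S))
      continue; // exact duplicate, already has an offset
    bool Found = false;
    for (const auto &P : Placed) {
      const std::string &Big = P.first;
      // S is a tail of Big: point into Big's storage, sharing its NUL.
      if (Big.size() >= S.size() &&
          Big.compare(Big.size() - S.size(), S.size(), S) == 0) {
        T.Off[S] = P.second + (Big.size() - S.size());
        Found = true;
        break;
      }
    }
    if (!Found) {
      T.Off[S] = T.Data.size();
      Placed.emplace_back(S, T.Data.size());
      T.Data += S;
      T.Data.push_back('\0');
    }
  }
  return T;
}
```

With the inputs "hello world", "world" and "ld", only the 12 bytes of "hello world" plus its terminator are stored; the other two strings resolve to offsets inside it.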
* COFF: Move assignment of section RVAs to assignAddresses(). NFCI.Peter Collingbourne2018-03-154-43/+26
| | | | | | | | | | This makes the design a little more similar to the ELF linker and should allow for features such as ARM range extension thunks to be implemented more easily. Differential Revision: https://reviews.llvm.org/D44501 llvm-svn: 327667
* Fix structure alignment issue.Zachary Turner2018-03-151-4/+0
| | | | llvm-svn: 327666
* [codeview] Delete FunctionInfo copy ctor and move out of DenseMapReid Kleckner2018-03-152-5/+11
| | | | | | | | | | | | | We were unnecessarily copying a bunch of these FunctionInfo objects around when rehashing the DenseMap. Furthermore, r327620 introduced pointers referring to objects owned by FunctionInfo, and the default copy ctor did the wrong thing in this case, leading to use-after-free when the DenseMap gets rehashed. I will rebase r327620 on this next and recommit it. llvm-svn: 327665
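The hazard and the fix can be sketched with standard containers. This is an assumption-laden illustration, not the CodeView code: `FuncInfoSketch`, `FnMap` and `getOrCreate` are hypothetical, and `std::unordered_map` stands in for `DenseMap` (unlike `DenseMap`, it never moves its elements, so here only the heap indirection and the deleted copy operations carry the point).

```cpp
#include <memory>
#include <string>
#include <type_traits>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified per-function record.
struct FuncInfoSketch {
  std::vector<int> Blocks;
  int *Current = nullptr; // points into Blocks; a memberwise copy would
                          // leave the copy pointing at the source's storage

  FuncInfoSketch() = default;
  // Deleting the copy operations turns accidental copies into compile errors.
  FuncInfoSketch(const FuncInfoSketch &) = delete;
  FuncInfoSketch &operator=(const FuncInfoSketch &) = delete;
};

static_assert(!std::is_copy_constructible<FuncInfoSketch>::value,
              "records must not be copied");

// The map owns pointers, so growing the table moves only the pointers and
// each record stays at a fixed heap address.
using FnMap = std::unordered_map<std::string, std::unique_ptr<FuncInfoSketch>>;

inline FuncInfoSketch &getOrCreate(FnMap &M, const std::string &Name) {
  std::unique_ptr<FuncInfoSketch> &Slot = M[Name];
  if (!Slot)
    Slot = std::make_unique<FuncInfoSketch>();
  return *Slot;
}
```

Interior pointers such as `Current` stay valid across table growth because the record they point into is never relocated.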
* [LICM] Ignore exits provably not taken on first iteration when computing must execute  Philip Reames  2018-03-15  3  -1/+347
  It is common to have conditional exits within a loop which are known not to be taken on some iterations, but not necessarily all. This patch extends our reasoning around guaranteed to execute (used when establishing whether it's safe to dereference a location from the preheader) to handle the case where an exit is known not to be taken on the first iteration and the instruction of interest *is* known to be taken on the first iteration.

  This case comes up in two major ways:
    * If we have a range check which we've been unable to eliminate, we frequently know that it doesn't fail on the first iteration.
    * Pass ordering. We may have a check which will be eliminated through some sequence of other passes, but depending on the exact pass sequence we might never actually do so or we might miss other optimizations from passes run before the check is finally eliminated.

  The initial version (here) is implemented via InstSimplify. At the moment, it catches a few cases, but misses a lot too. I added test cases for missing cases in InstSimplify which I'll follow up on separately. Longer term, we should probably wire SCEV through to here to get much smarter loop aware simplification of the first iteration predicate.

  Differential Revision: https://reviews.llvm.org/D44287

  llvm-svn: 327664
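A source-level sketch of the situation, with hypothetical function names: the early exit `I > N` can never fire when `I == 0` because `N` is unsigned, so the load of `*P` is guaranteed to run once the loop is entered and can be moved in front of it. The real transformation operates on LLVM IR; this only illustrates the reasoning.

```cpp
#include <cstddef>

// Before: *P is loop-invariant, but an early exit precedes the load.
inline long sumBefore(const int *P, std::size_t N, std::size_t Limit) {
  long Sum = 0;
  for (std::size_t I = 0; I < Limit; ++I) {
    if (I > N) // provably false on the first iteration (0 > N never holds)
      break;
    Sum += *P;
  }
  return Sum;
}

// After: the load is hoisted to where the preheader would be. The guard
// keeps the load from running when the loop body would never execute.
inline long sumAfter(const int *P, std::size_t N, std::size_t Limit) {
  if (Limit == 0)
    return 0;
  const int Val = *P; // hoisted: the first iteration always reaches the load
  long Sum = 0;
  for (std::size_t I = 0; I < Limit; ++I) {
    if (I > N)
      break;
    Sum += Val;
  }
  return Sum;
}
```

If the exit could be taken on the first iteration, hoisting would introduce a dereference that the original program never performed.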
* [AArch64] Adjust the cost model for Exynos M3Evandro Menezes2018-03-151-2/+2
| | | | | | Fix typo. llvm-svn: 327663
* [AArch64] Adjust the cost model for Exynos M3Evandro Menezes2018-03-151-0/+7
| | | | | | Add special case for rotate right. llvm-svn: 327662
* [AArch64] Adjust the cost model for Exynos M3Evandro Menezes2018-03-152-12/+59
| | | | | | Increase the number of cheap as move cases of register reset. llvm-svn: 327661
* [X86] Make sure we use FSUB instruction as the reference for operand order in isAddSubOrSubAdd when recognizing subadd  Craig Topper  2018-03-15  2  -9/+31
  The FADD part of the addsub/subadd pattern can have its operands commuted, but when checking for fsubadd we were using the fadd as reference and commuting the fsub node.

  llvm-svn: 327660
* [X86] Add test case showing bad fmsubadd creation due to bad commuting.Craig Topper2018-03-151-0/+19
| | | | | | The code that creates fmsubadd from shuffle vector has some code to allow commuting the operands of the fadd node. This code was originally created when we only recognized fmaddsub. When fmsubadd support was added this code was not updated and is now commuting the fsub operands instead. llvm-svn: 327659
* Remove empty fileDavid Blaikie2018-03-151-13/+0
| | | | | | | I should've deleted this in r320768 but accidentally just deleted its contents instead. llvm-svn: 327658
* Revert r327620 "[CodeView] Initial support for emitting S_BLOCK32 symbols for lexical scopes"  Reid Kleckner  2018-03-15  4  -510/+14
  It is causing crashes when compiling Chrome in debug mode. I'll try to debug it in a second.

  llvm-svn: 327657
* [LV] Test commit. Removing white space.Diego Caballero2018-03-151-1/+1
| | | | | | This is just to check that I have commit access privilege. llvm-svn: 327656
* [EarlyCSE] Don't hide earlier invariant.scopes  Philip Reames  2018-03-15  3  -3/+41
| | | | | | | | If we've already established an invariant scope with an earlier generation, we don't want to hide it in the scoped hash table with one with a later generation. I noticed this when working on the invariant-load handling, but it also applies to the invariant.start case as well. Without this change, my previous patch for invariant-load regresses some cases, so I'm pushing this without waiting for review. This is why you don't make last minute tweaks to patches to catch "obvious cases" after it's already been reviewed. Bad Philip! llvm-svn: 327655
* [OPENMP, NVPTX] Improve globalization of the variables captured by value.Alexey Bataev2018-03-155-90/+269
| | | | | | | | | | | | | If the variable is captured by value and the corresponding parameter in the outlined function escapes its declaration context, this parameter must be globalized. To globalize it we need to get the address of the original parameter, load the value, store it to the global address and use this global address instead of the original. Patch improves globalization for parallel|teams regions + functions in declare target regions. llvm-svn: 327654
* Move some function declarations higher so they can be found.Zachary Turner2018-03-151-3/+3
| | | | llvm-svn: 327653
* Add missing #includes.Zachary Turner2018-03-151-0/+2
| | | | llvm-svn: 327652
* [PPC] Avoid non-simple MVT in STBRX optimizationGuozhi Wei2018-03-152-1/+23
  PR35402 triggered this case. It byte-swaps and stores a 48-bit value, and the current STBRX optimization transforms that into an STBRX. Unfortunately a 48-bit integer is not a simple MVT, no PPC instruction supports it, and LLVM cannot expand it automatically, so the result was a crash.

  This patch detects the non-simple MVT and returns early.

  Differential Revision: https://reviews.llvm.org/D44500

  llvm-svn: 327651
* [X86][Btver2] Attach AES/CLMUL instructions to a scheduler pipeSimon Pilgrim2018-03-151-4/+4
| | | | llvm-svn: 327650
* [X86] Simplify the type legality checking for (FM)ADDSUB/SUBADD matching. NFCICraig Topper2018-03-151-9/+7
| | | | | | Rather than enumerating all specific types, for the DAG combine we can just use TLI::isTypeLegal and an SSE3 check. For the BUILD_VECTOR version we already know the type is legal so we just need to check SSE3. llvm-svn: 327649
* [X86] Fix 80 column violations.Craig Topper2018-03-151-2/+4
| | | | llvm-svn: 327648
* Refactor the PDB HashTable class.Zachary Turner2018-03-158-313/+320
| | | | | | | | | It previously only worked when the key and value types were both 4 byte integers. We now have a use case for a non trivial value type, so we need to extend it to support arbitrary value types, which means templatizing it. llvm-svn: 327647
* [EarlyCSE] Reuse invariant scopes for invariant loadPhilip Reames2018-03-152-13/+48
| | | | | | | | | | This is a follow up to https://reviews.llvm.org/D43716 which rewrites the invariant load handling using the new infrastructure. It's slightly more powerful, but only in somewhat minor ways for the moment. It's not clear that DSE of stores to invariant locations is actually interesting since why would your IR have such a construct to start with? Note: The submitted version is slightly different than the reviewed one. I realized the scope could start for an invariant load which was proven redundant and removed. Added a test case to illustrate that as well. Differential Revision: https://reviews.llvm.org/D44497 llvm-svn: 327646
* Add a comment about ELF spec and the symbol table's sh_info.Rui Ueyama2018-03-151-0/+4
| | | | llvm-svn: 327645
* Split skipIf decorator, the condition is supposed to be ORAdrian Prantl2018-03-151-1/+2
| | | | llvm-svn: 327644
* [dotest] remove confirm_directory_exclusivity remnantsPavel Labath2018-03-151-13/+0
| | | | llvm-svn: 327643
* [InstSimplify] peek through unsigned FP casts for sign-bit compares (PR36682)Roman Lebedev2018-03-152-63/+30
  This pattern came up in PR36682 / D44390
  https://bugs.llvm.org/show_bug.cgi?id=36682
  https://reviews.llvm.org/D44390
  https://godbolt.org/g/oKvT5H

  See also D44421, D44424

  Reviewers: spatel, majnemer, efriedma, arsenm
  Reviewed By: spatel
  Subscribers: wdng, llvm-commits
  Differential Revision: https://reviews.llvm.org/D44425

  llvm-svn: 327642
* [InstSimplify][NFC] simplifyICmpWithConstant(): refactor GetCompareTy() callsRoman Lebedev2018-03-151-4/+6
| | | | | | Preparation for D44425. llvm-svn: 327641
* [llvm-mca] Simplify code. NFC.Andrea Di Biagio2018-03-153-5/+7
  Now both methods, DispatchUnit::checkRAT() and DispatchUnit::canDispatch(), take as input an Instruction reference instead of an instruction descriptor. This was requested by Simon in D44488 to simplify the diff.

  llvm-svn: 327640
* [OpenMP][libomptarget] Enable usage of shared memory slotsGheorghe-Teodor Bercea2018-03-151-15/+1
  Summary: Allow the runtime to use the existing statically allocated shared memory slots. When a variable is globalized, the underlying memory can be either shared or global memory (both have block-wide visibility). In this case, we allow the storage to use a limited amount of shared memory that has already been statically allocated. Only if shared memory proves insufficient do we invoke malloc() to create a new global memory slot.

  Reviewers: ABataev, carlo.bertolli, grokos, caomhin
  Reviewed By: grokos
  Subscribers: guansong, openmp-commits
  Differential Revision: https://reviews.llvm.org/D44486

  llvm-svn: 327639
* [ConstantFolding, InstSimplify] Handle more vector GEPsMatthew Simpson2018-03-154-9/+64
| | | | | | | | | | This patch addresses some additional cases where the compiler crashes upon encountering vector GEPs. This should fix PR36116. Differential Revision: https://reviews.llvm.org/D44219 Reference: https://bugs.llvm.org/show_bug.cgi?id=36116 llvm-svn: 327638
* [OpenMP][libomptarget] Enable multiple frames per global memory slotGheorghe-Teodor Bercea2018-03-153-47/+121
| | | | | | | | | | | | | | Summary: To save on calls to malloc, this patch enables the re-use of pre-allocated global memory slots. Reviewers: ABataev, grokos, carlo.bertolli, caomhin Reviewed By: grokos Subscribers: guansong, openmp-commits Differential Revision: https://reviews.llvm.org/D44470 llvm-svn: 327637
* [OPENMP] Codegen for `omp declare target` construct.Alexey Bataev2018-03-159-14/+184
| | | | | | | | Added initial codegen for device side of declarations inside `omp declare target` construct + codegen for implicit `declare target` functions, which are used in the target regions. llvm-svn: 327636
* [PowerPC] Optimize TLS initial-exec sequence to use X-Form loads/storesZaara Syeda2018-03-153-2/+324
| | | | | | | | | This patch adds new load/store instructions for integer scalar types which can be used for X-Form when fed by add with an @tls relocation. Differential Revision: https://reviews.llvm.org/D43315 llvm-svn: 327635
* Recommit r326946 after reducing CallArgList memory footprintYaxun Liu2018-03-1515-111/+333
| | | | llvm-svn: 327634