summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* llvm-profdata: Reduce memory usage by using Error callback rather than memberDavid Blaikie2017-07-102-33/+40
| | | | | | | | | | | | | | | | Reduces llvm-profdata memory usage on a large profile from 7.8GB to 5.1GB. The ProfData API now supports reporting all the errors/warnings rather than only the first, though llvm-profdata ignores everything after the first for now to preserve existing behavior. (if there's a desire for other behavior, happy to implement that - but might be as well left for a separate patch) Reviewers: davidxl Differential Revision: https://reviews.llvm.org/D35149 llvm-svn: 307516
* [X86] Relax an assertion when legalizing vector types.Davide Italiano2017-07-091-0/+4
| | | | | | | | | | | | | | WidenVSELECTAndMask can fold (and it folds in this case) so we get a BUILD_VECTOR of constants as mask. convertMask() seems to work fine when the input is a vector of constants, and we still need to call it to extend/add elements at the end. but the current code just asserts on anything but a SETCC or AND/OR/XOR of 2xSETCC. This change was discussed briefly with Simon Pilgrim, who also suggests we might consider dropping this assertion in the future. Fixes PR33715. llvm-svn: 307508
* [X86] Allow GHC calling convention to use YMM and ZMM registersSimon Pilgrim2017-07-091-1/+9
| | | | | | | | | | GHC 8.4 will know how to use YMM and ZMM registers for calls. Submitted on behalf of @bgamari (Ben Gamari) Differential Revision: https://reviews.llvm.org/D34854 llvm-svn: 307504
* Handle ConstantExpr correctly in SelectionDAGBuilderSimon Pilgrim2017-07-092-8/+18
| | | | | | | | | | | | This change fixes a bug in SelectionDAGBuilder::visitInsertValue and SelectionDAGBuilder::visitExtractValue where constant expressions (InsertValueConstantExpr and ExtractValueConstantExpr) would be treated as non-constant instructions (InsertValueInst and ExtractValueInst). This bug resulted in an incorrect memory access, which manifested as an assertion failure in SDValue::SDValue. Fixes PR#33094. Submitted on behalf of @Praetonus (Benoit Vey) Differential Revision: https://reviews.llvm.org/D34538 llvm-svn: 307502
* [PM] Fix a nasty bug in the new PM where we failed to properlyChandler Carruth2017-07-092-21/+50
| | | | | | | | | | | | | | | | | | | | | | | | invalidation of analyses when merging SCCs. While I've added a bunch of testing of this, it takes something much more like the inliner to really trigger this as you need to have partially-analyzed SCCs with updates at just the right time. So I've added a direct test for this using the inliner and verifying the domtree. Without the changes here, this test ends up finding a stale dominator tree. However, to handle this properly, we need to invalidate analyses *before* merging the SCCs. After talking to Philip and Sanjoy about this they convinced me this was the right approach. To do this, we need a callback mechanism when merging SCCs so we can observe the cycle that will be merged before the merge happens. This API update ended up being surprisingly easy. With this commit, the new PM passes the test-suite again. It hadn't since MemorySSA was enabled for EarlyCSE as that also will find this bug very quickly. llvm-svn: 307498
* [PM] Add unittesting of the call graph update logic with complexChandler Carruth2017-07-091-6/+51
| | | | | | | | | | | dependencies between analyses. This uncovers even more issues with the proxies and the splitting apart of SCCs which are fixed in this patch. I discovered this while trying to add more rigorous testing for a change I'm making to the call graph update invalidation logic. llvm-svn: 307497
* [X86] Remove check for AVX512 support from skylake-avx512 detection in ↵Craig Topper2017-07-091-6/+1
| | | | | | | | getHostCPUName. Users of getHostCPUName should also use getHostCPUFeatures which will take care of making sure avx512 is disabled if the CPU doesn't support it. This is consistent with what we do for other CPUs. llvm-svn: 307495
* [IR] Add Type::isIntOrIntVectorTy(unsigned) similar to the existing ↵Craig Topper2017-07-0911-35/+30
| | | | | | isIntegerTy(unsigned), but also works for vectors. llvm-svn: 307492
* [IR] Make use of ↵Craig Topper2017-07-0912-49/+40
| | | | | | Type::isPtrOrPtrVectorTy/isIntOrIntVectorTy/isFPOrFPVectorTy to shorten code. NFC llvm-svn: 307491
* [FastISel] fix a fallback diagnostic.Igor Breger2017-07-091-1/+2
| | | | | | | | | | | | | | Summary: FastISel was marked as failed in case instruction selection succeeded. Reviewers: qcolombet, zvi, rovka, ab Reviewed By: zvi Subscribers: javed.absar, ab, qcolombet, bogner, llvm-commits Differential Revision: https://reviews.llvm.org/D34438 llvm-svn: 307489
* fix trivial typos; NFCHiroshi Inoue2017-07-095-9/+9
| | | | | | sucessor -> successor llvm-svn: 307488
* [PM] Finish implementing and fix a chain of bugs uncovered by testingChandler Carruth2017-07-092-23/+97
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the invalidation propagation logic from an SCC to a Function. I wrote the infrastructure to test this but didn't actually use it in the unit test where it was designed to be used. =[ My bad. Once I actually added it to the test case I discovered that it also hadn't been properly implemented, so I've implemented it. The logic in the FAM proxy for an SCC pass to propagate invalidation follows the same ideas as the FAM proxy for a Module pass, but the implementation is a bit different to reflect the fact that it is forwarding just for an SCC. However, implementing this correctly uncovered a surprising "bug" (it was conservatively correct but relatively very expensive) in how we handle invalidation when splitting one SCC into multiple SCCs. We did an eager invalidation when in reality we should be deferring invaliadtion for the *current* SCC to the CGSCC pass manager and just invaliating the newly constructed SCCs. Otherwise we end up invalidating too much too soon. This was exposed by the inliner test case that I've updated. Now, we invalidate *just* the split off '(test1_f)' SCC when doing the CG update, and then the inliner finishes and invalidates the '(test1_g, test1_h)' SCC's analyses. The first few attempts at fixing this hit still more bugs, but all of those are covered by existing tests. For example, the inliner should also preserve the FAM proxy to avoid unnecesasry invalidation, and this is safe because the CG update routines it uses handle any necessary adjustments to the FAM proxy. Finally, the unittests for the CGSCC pass manager needed a bunch of updates where we weren't correctly preserving the FAM proxy because it hadn't been fully implemented and failing to preserve it didn't matter. Note that this doesn't yet fix the current crasher due to MemSSA finding a stale dominator tree, but without this the fix to that crasher doesn't really make any sense when testing because it relies on the proxy behavior. llvm-svn: 307487
* [InstCombine] Speculatively implement a fix for what might be the root cause ↵Craig Topper2017-07-091-1/+2
| | | | | | | | | | of PR33721 by making sure that we have integer types before doing select C, -1, 0 -> sext C to int I recently changed m_One and m_AllOnes to use Constant::isOneValue/isAllOnesValue which work on floating point values too. The original implementation looked specifically for ConstantInt scalars and splats. So I'm guessing we are accidentally trying to issue sext/zexts on floating point types now. Hopefully I figure out how to reproduce the failure from the PR soon. llvm-svn: 307486
* [AMDGPU] Fix -Wimplicit-fallthrough warning. NFCI.Simon Pilgrim2017-07-081-6/+2
| | | | llvm-svn: 307485
* [AArch64] Fix -Wimplicit-fallthrough warnings. NFCI.Simon Pilgrim2017-07-081-2/+6
| | | | | | Add breaks - doesn't affect results as both GPR/FPU both check for 32/64 bit sizes. So will still default to GenericOps in the same way. llvm-svn: 307484
* [ARM] Fix -Wimplicit-fallthrough warning. NFCI.Simon Pilgrim2017-07-081-0/+1
| | | | llvm-svn: 307480
* [Bash-autocompletion] Auto complete cc1 options if -cc1 is specifiedYuka Takahashi2017-07-081-1/+5
| | | | | | | | | | | | | | Summary: We don't want to autocomplete flags whose Flags class has `NoDriverOption` when argv[1] is not `-cc1`. Another idea for this implementation is to make --autocomplete a cc1 option and handle it in clang Frontend, by porting --autocomplete handler from Driver to Frontend, so that we can handle Driver options and CC1 options in unified manner. Differential Revision: https://reviews.llvm.org/D34770 llvm-svn: 307479
* Re-enable "[IndVars] Canonicalize comparisons between non-negative values ↵Max Kazantsev2017-07-081-0/+11
| | | | | | | | | | | | | | and indvars" The patch was reverted due to a bug. The bug was that if the IV is the 2nd operand of the icmp instruction, then the "Pred" variable gets swapped and differs from the instruction's predicate. In this patch we use the original predicate to do the transformation. Also added a test case that exercises this situation. Differentian Revision: https://reviews.llvm.org/D35107 llvm-svn: 307477
* Fix -Wimplicit-fallthrough warning. NFCI.Simon Pilgrim2017-07-081-0/+1
| | | | llvm-svn: 307473
* [x86] add SBB optimization for SETBE (ule) condition codeSanjay Patel2017-07-081-1/+13
| | | | | | | | | | | | | | | | | | | | | | | x86 scalar select-of-constants (Cond ? C1 : C2) combining/lowering is a mess with missing optimizations. We handle some patterns, but miss logical variants. To clean that up, we should convert all select-of-constants to logic/math and enhance the combining for the expected patterns from that. Selecting 0 or -1 needs extra attention to produce the optimal code as shown here. Attempt to verify that all of these IR forms are logically equivalent: http://rise4fun.com/Alive/plxs Earlier steps in this series: rL306040 rL306072 rL307404 (D34652) As acknowledged in the earlier review, there's a possibility that some Intel uarch would prefer to produce an xor to clear the fake register operand with sbb %eax, %eax. This will likely need to be addressed in a separate pass. llvm-svn: 307471
* [Solaris] get rid of _RESTRICT_KYWD warning during the buildKamil Rytarowski2017-07-081-3/+0
| | | | | | | | | | | | | | | | | | Summary: (re)definition of _RESTRICT_KYWD rightfully causes a warning message during the Solaris build. This hack is not needed if build compiler is properly configured (.e.g /usr/bin/gcc) so just remove it. Reviewers: ro, mgorny, krytarowski, joerg Reviewed By: joerg Subscribers: quenelle, llvm-commits Patch by Fedor Sergeev (Oracle). Differential Revision: https://reviews.llvm.org/D35054 llvm-svn: 307469
* [X86] In getHostCPUName, remove some code that changes some AMD CPU names ↵Craig Topper2017-07-081-15/+1
| | | | | | | | | | | | based on features not being enabled. The CPU name is really just used for scheduler and other microarchitectural optimizations. The feature flags should be determined by getHostCPUFeatures which should always be used with getHostCPUName. Trying to alter CPU name strings to control features just isn't practical. Most of these types of things were removed from Intel CPUs a while ago. This is part of my plan to bring compiler-rt's cpu_model.c file up to date with the equivalent functionality in libgcc. A lot of the code in that file is copied from Host.cpp and we want to keep them reasonably in sync. llvm-svn: 307467
* [X86] Correct the BDVER4 model numbers to include 0x70-0x7f.Craig Topper2017-07-081-1/+1
| | | | | | According to wikipedia and some other googling suggests these should also be considered as BDVER4. llvm-svn: 307466
* [X86] Minor formatting fix. NFCCraig Topper2017-07-081-4/+2
| | | | llvm-svn: 307465
* [X86] Use 'unsigned' instead of 'unsigned int' for consistency in the X86 ↵Craig Topper2017-07-081-7/+7
| | | | | | portion of Host.cpp. llvm-svn: 307463
* [X86] Cleanup some CPUID usage in getAvailableFeatures.Craig Topper2017-07-081-5/+10
| | | | | | We should make sure leaf 1 is available before accessing it. Same with leaf 0x80000001. llvm-svn: 307462
* Revert "Revert "Revert "Revert "Switch external cvtres.exe for llvm's own ↵Eric Beckmann2017-07-082-17/+11
| | | | | | | | | | | | | | | | | | | | resource library."""" This reverts commit 147f45ff24456aea59575fa4ac16c8fa554df46a. Revert "Revert "Revert "Revert "Replace trivial use of external rc.exe by writing our own .res file."""" This reverts commit 61a90a67ed54a1f0dfeab457b65abffa129569e4. The patches were intially reverted because they were causing a failure on CrWinClangLLD. Unfortunately, this was done haphazardly and didn't compile, so the revert was reverted again quickly to fix this. One that was done, the revert of the revert was itself reverted. This allowed me to finally fix the actual bug in r307452. This patch re-enables the code path that had originally been causing the bug, now that it (should) be fixed. llvm-svn: 307460
* Remove a variable that was only used in asserts and had a duplicate copy in ↵Eric Christopher2017-07-081-3/+2
| | | | | | something we did use anyhow. llvm-svn: 307457
* Add name offset flags, for parity with cvtres.exe.Eric Beckmann2017-07-071-2/+2
| | | | | | | | | | | | | | | | | | | Summary: The original cvtres.exe sets the high bit when an identifier offset points to a string. Even though this is not mentioned in the spec, and in fact does not seem to cause errors with most cases, for some reason this causes a failure in Chromium where the new resource file is not verified as a new version. This patch sets this high bit flag, and also adds a test case to check that the output of our library is always identical to original cvtres. Reviewers: zturner, ruiu Subscribers: llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D35099 llvm-svn: 307452
* [InstCombine] Make InstCombine's IRBuilder be passed by reference everywhereCraig Topper2017-07-0714-793/+778
| | | | | | | | Previously the InstCombiner class contained a pointer to an IR builder that had been passed to the constructor. Sometimes this would be passed to helper functions as either a pointer or the pointer would be dereferenced to be passed by reference. This patch makes it a reference everywhere including the InstCombiner class itself so there is more inconsistency. This a large, but mechanical patch. I've done very minimal formatting changes on it despite what clang-format wanted to do. llvm-svn: 307451
* [PowerPC] NFC : Common up definitions of isIntS16Immediate and update ↵Lei Huang2017-07-073-30/+14
| | | | | | parameter to int16_t llvm-svn: 307442
* Increase the import-threshold for crtical functions.Dehao Chen2017-07-072-1/+9
| | | | | | | | | | | | | | Summary: For interative sample-pgo, if a hot call site is inlined in the profiling binary, we should inline it in before profile annotation in the backend. Before that, the compile phase first collects all GUIDs that needs to be imported and creates virtual "hot" call edge in the summary. However, "hot" is not good enough to guarantee the callsites get inlined. This patch introduces "critical" call edge, and assign much higher importing threshold for those edges. Reviewers: tejohnson Reviewed By: tejohnson Subscribers: sanjoy, mehdi_amini, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D35096 llvm-svn: 307439
* Add sample PGO support to ThinLTO new pass manager.Dehao Chen2017-07-071-13/+28
| | | | | | | | | | | | | | | | | | | | | Summary: For SamplePGO + ThinLTO, because profile annotation is done twice at both PrepareForThinLTO pipeline and backend compiler, the following changes are needed at the PrepareForThinLTO phase to ensure the IR is not changed dramatically. Otherwise the profile annotation will be inaccurate in the backend compiler. * disable hot-caller heuristic * disable loop unrolling * disable indirect call promotion This will unblock the new PM testing for sample PGO (tools/clang/test/CodeGen/pgo-sample-thinlto-summary.c), which will be covered in another cfe patch. Reviewers: chandlerc, tejohnson, davidxl Reviewed By: tejohnson Subscribers: sanjoy, mehdi_amini, Prazek, inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D34895 llvm-svn: 307437
* [PDB] More changes to bring lld PDBs to parity with MSVC.Zachary Turner2017-07-072-4/+4
| | | | | | | | | | | | | | | | | | | 1) Don't write a /src/headerblock stream. This appears to be written conditionally by MSVC, but it's not clear what the condition is. For now, just remove it since we dont' know what it is anyway and the particular pdb we've checked in for the test doesn't have one. 2) Write a valid timestamp for the PDB file signature. This leads to non-reproducible builds, but it matches the default behavior of link, so it should be out default as well. If we need reproducibility, we should add a separate command line option for it that is off by default. 3) Write an empty FPO stream. MSVC seems to always write an FPO stream. This change makes the stream directory match up, although we still need to make the contents of the FPO stream match. llvm-svn: 307436
* [LoopUnrollRuntime] Support multiple exit blocks unrolling when prolog ↵Anna Thomas2017-07-071-2/+0
| | | | | | | | | | | | | | | remainder generated With the NFC refactoring in rL307417 (git SHA 987dd01), all the logic is in place to support multiple exit/exiting blocks when prolog remainder is generated. This patch removed the assert that multiple exit blocks unrolling is only supported when epilog remainder is generated. Also, added test runs and checks with PROLOG prefix in runtime-loop-multiple-exits.ll test cases. llvm-svn: 307435
* [DAGCombiner] use local variable to shorten code; NFCISanjay Patel2017-07-071-36/+31
| | | | llvm-svn: 307429
* [RegAllocFast] Don't insert kill flags of super-register for partial killQuentin Colombet2017-07-071-2/+9
| | | | | | | | | | | | | | | | | When reusing a register for a new definition, the fast register allocator used to insert a kill flag at the previous last use of that register to inform later passes that this register is free between the redef and the last use. However, this may be wrong when subregisters are involved. Indeed, a partially redef would have trigger a kill of the full super register, potentially wrongly marking all the other subregisters as free. Given we don't track which lanes are still live, we cannot set the kill flag in such case. Note: This bug has been latent for about 7 years (r104056). llvmg.org/PR33677 llvm-svn: 307428
* [RegAllocFast] Add the proper initialize method to use the .mir infrastructureQuentin Colombet2017-07-072-0/+3
| | | | | | NFC llvm-svn: 307427
* [Local] Update the comment for removeUnreachableBlocks.Davide Italiano2017-07-071-2/+3
| | | | | | | It referenced a wrong function name, and didn't mention what the second argument did. This should be slightly more accurate now. llvm-svn: 307425
* FuzzerUtilDarwin.cpp: We need to pass modifiable strings to posix_spawnMatthias Braun2017-07-071-2/+11
| | | | | | | | This fixes a bug where unmodifiable strings where passed to posix_spawn. This is an attempt to unbreak the greendragon libFuzzer bot. llvm-svn: 307424
* Fix some differences between lld and MSVC generated PDBs.Zachary Turner2017-07-073-3/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A couple of things were different about our generated PDBs. 1) We were outputting the wrong Version on the PDB Stream. The version we were setting was newer than what MSVC is setting. It's not clear what the implications are, but we change LLD to use PdbImplVC70, as MSVC does. 2) For the optional debug stream indices in the DBI Stream, we were outputting 0 to mean "the stream is not present". MSVC outputs uint16_t(-1), which is the "correct" way to specify that a stream is not present. So we fix that as well. 3) We were setting the PDB Stream signature to 0. This is supposed to be the result of calling time(nullptr). Although this leads to non-deterministic builds, a better way to solve that is by having a command line option explicitly for generating a reproducible build, and have the default behavior of lld-link match the default behavior of link. To test this, I'm making use of the new and improved `pdb diff` sub command. To make it suitable for writing tests against, I had to modify the diff subcommand slightly to print less verbose output. Previously it would always print | <column> | <value1> | <value2> | which is quite verbose, and the values are fragile. All we really want to know is "did we produce the same value as link?" So I added command line options to print a single character representing the result status (different, identical, equivalent), and another to hide the value display. Note that just inspecting the diff output used to write the test, you can see some things that are obviously wrong. That is just reflective of the fact that this is the state of affairs today, not that we're asserting that this is "correct". We can use this as a starting point to discover differences, fix them, and update the test. Differential Revision: https://reviews.llvm.org/D35086 llvm-svn: 307422
* [llvm-pdbutil] Improve diff mode.Zachary Turner2017-07-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We're getting to the point that some MS tools (e.g. DIA) can recognize our PDBs but others (e.g. link.exe) cannot. I think the way forward is to improve our tooling to help us find differences more easily. For example, if we can compile the same program with clang-cl and cl and have a tool tell us all the places where the PDBs differ, this could tell us what we're doing wrong. It's tricky though, because there are a lot of "benign" differences in a PDB. For example, if the string table in one PDB consists of "foo" followed by "bar" and in the other PDB it consists of "bar" followed by "foo", this is not necessarily a critical difference, as long as the uses of these strings also refer to the correct location. On the other hand, if the second PDB doesn't even contain the string "foo" at all, this is a critical difference. diff mode has been in llvm-pdbutil for quite a while, but because of the above challenge along with some others, it's been hard to make it useful. I think this patch addresses that. It looks for all the same things, but it now prints the output in tabular format (carefully formatted and aligned into tables and fields), and it highlights critical differences in red, non-critical differences in yellow, and identical fields in green. This makes it easy to spot the places we differ, and the general concept of outputting arbitrary fields in tabular format can be extended to provide analysis into many of the different types of information that show up in a PDB. Differential Revision: https://reviews.llvm.org/D35039 llvm-svn: 307421
* [cloning] Do not duplicate types when cloning functionsGor Nishanov2017-07-071-3/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is an addon to the change rl304488 cloning fixes. (Originally rl304226 reverted rl304228 and reapplied rl304488 https://reviews.llvm.org/D33655) rl304488 works great when DILocalVariables that comes from the inlined function has a 'unique-ed' type, but, in the case when the variable type is distinct we will create a second DILocalVariable in the scope of the original function that was inlined. Consider cloning of the following function: ``` define private void @f() !dbg !5 { %1 = alloca i32, !dbg !11 call void @llvm.dbg.declare(metadata i32* %1, metadata !14, metadata !12), !dbg !18 ret void, !dbg !18 } !14 = !DILocalVariable(name: "inlined", scope: !15, file: !6, line: 5, type: !17) ; came from an inlined function !15 = distinct !DISubprogram(name: "inlined", linkageName: "inlined", scope: null, file: !6, line: 8, type: !7, isLocal: true, isDefinition: true, scopeLine: 9, isOptimized: false, unit: !0, variables: !16) !16 = !{!14} !17 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "some_struct", size: 32, align: 32) ``` Without this fix, when function 'f' is cloned, we will create another DILocalVariable for "inlined", due to its type being distinct. ``` define private void @f.1() !dbg !23 { %1 = alloca i32, !dbg !26 call void @llvm.dbg.declare(metadata i32* %1, metadata !28, metadata !12), !dbg !30 ret void, !dbg !30 } !14 = !DILocalVariable(name: "inlined", scope: !15, file: !6, line: 5, type: !17) !15 = distinct !DISubprogram(name: "inlined", linkageName: "inlined", scope: null, file: !6, line: 8, type: !7, isLocal: true, isDefinition: true, scopeLine: 9, isOptimized: false, unit: !0, variables: !16) !16 = !{!14} !17 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "some_struct", size: 32, align: 32) ; !28 = !DILocalVariable(name: "inlined", scope: !15, file: !6, line: 5, type: !29) ; OOPS second DILocalVariable !29 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "some_struct", size: 32, align: 32) ``` Now we have two DILocalVariable for "inlined" within the same scope. This result in assert in AsmPrinter/DwarfDebug.h:131: void llvm::DbgVariable::addMMIEntry(const llvm::DbgVariable &): Assertion `V.Var == Var && "conflicting variable"' failed. (Full example: See: https://bugs.llvm.org/show_bug.cgi?id=33492) In this change we prevent duplication of types so that when a metadata for DILocalVariable is cloned it will get uniqued to the same metadate node as an original variable. Reviewers: loladiro, dblaikie, aprantl, echristo Reviewed By: loladiro Subscribers: EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D35106 llvm-svn: 307418
* [LoopUnrollRuntime] NFC: use the precomputed loop exit in ConnectPrologAnna Thomas2017-07-071-11/+11
| | | | | | | | | Minor refactoring to use the preexisting loop exit that's already calculated. We do not need to recompute the loop exit in ConnectProlog. Apart from avoiding redundant computation, this is required for supporting multiple loop exits when Prolog remainder loops are generated. llvm-svn: 307417
* [PPC CodeGen] Expand the bitreverse.i32 intrinsic.Tony Jiang2017-07-072-0/+71
| | | | | | | Differential Revision: https://reviews.llvm.org/D33572 Fix PR: https://bugs.llvm.org/show_bug.cgi?id=33093 llvm-svn: 307413
* Fix some more -Wimplicit-fallthrough warnings. NFCI.Simon Pilgrim2017-07-074-5/+9
| | | | llvm-svn: 307411
* [ARM] Implement interleaved access bug fix from r306334Matthew Simpson2017-07-071-1/+3
| | | | | | | r306334 fixed a bug in AArch64 dealing with wide interleaved accesses having pointer types. The bug also exists in ARM, so this patch copies over the fix. llvm-svn: 307409
* [AMDGPU] Assembler: refactor convert methods (VOP3 and MIMG)Sam Kolton2017-07-072-125/+57
| | | | | | | | | | | | Summary: Simplified converter methods for VOP3 and MIMG. Reviewers: dp, artem.tamazov Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, vpykhtin, t-tye Differential Revision: https://reviews.llvm.org/D35047 llvm-svn: 307407
* Fix variable names. NFC.Rafael Espindola2017-07-071-18/+18
| | | | llvm-svn: 307406
* [x86] add SBB optimization for SETAE (uge) condition codeSanjay Patel2017-07-071-1/+16
| | | | | | | | | | | | | | | | | | | | | | x86 scalar select-of-constants (Cond ? C1 : C2) combining/lowering is a mess with missing optimizations. We handle some patterns, but miss logical variants. To clean that up, we should convert all select-of-constants to logic/math and enhance the combining for the expected patterns from that. DAGCombiner already has the foundation to allow the transforms, so we just need to fill in the holes for x86 math op lowering. Selecting 0 or -1 needs extra attention to produce the optimal code as shown here. Attempt to verify that all of these IR forms are logically equivalent: http://rise4fun.com/Alive/plxs Earlier steps in this series: rL306040 rL306072 Differential Revision: https://reviews.llvm.org/D34652 llvm-svn: 307404
OpenPOWER on IntegriCloud