summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [SelectionDAG] fix build warning for mismatched signs in compare; NFCSanjay Patel2018-10-301-1/+1
| | | | llvm-svn: 345598
* [SystemZ] Improve isFoldableLoad() for Sub, SDiv and UDiv.Jonas Paulsson2018-10-301-0/+5
| | | | | | | | | | Sub, SDiv and UDiv are not commutative, so only the RHS operand can fold a load. This patch adds a check for this. Review: Ulrich Weigand https://reviews.llvm.org/D53791 llvm-svn: 345596
* [X86] Re-enable the machine verifier after fixing more testsFrancis Visoiu Mistrih2018-10-301-4/+0
| | | | | | Was disabled again in r345528. Hopefully this the bots. llvm-svn: 345593
* [llc] Error out when -print-machineinstrs is used with an unknown passFrancis Visoiu Mistrih2018-10-301-9/+11
| | | | | | | | We used to assert instead of reporting an error. PR39494 llvm-svn: 345589
* [SROA] Use offset sizes from the DataLayout instead of the pointer siezes.Nicola Zaghen2018-10-301-6/+6
| | | | | | | | | | This fixes an assertion when constant folding a GEP when the part of the offset was in i32 (IndexSize, as per DataLayout) and part in the i64 (PointerSize) in the newly created test case. Differential Revision: https://reviews.llvm.org/D52609 llvm-svn: 345585
* [X86][BMI1] X86DAGToDAGISel: select BEXTR from x & (-1 >> (32 - y)) patternRoman Lebedev2018-10-302-58/+40
| | | | | | | | | | | | | | | | | | | | | Summary: The final pattern. There is no test changes: * We are looking for the pattern with one-use of it's mask, * If the mask is one-use, D48768 will unfold it into pattern d. * Thus, the tests have extra-use on the mask. * Thus, only the BMI2 BZHI can be tested, and it already worked. * So there is no BMI1 test coverage, we just assume it works since it uses the same codepath. Reviewers: craig.topper, RKSimon Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D53575 llvm-svn: 345584
* [AArch64] Add support for UDF instructionDiogo N. Sampaio2018-10-302-10/+29
| | | | | | | | | | | | | | | | Summary: Add support for AArch64 UDF instruction. UDF - Permanently Undefined generates an Undefined Instruction exception (ESR_ELx.EC = 0b000000). Reviewers: DavidSpickett, javed.absar, t.p.northover Reviewed By: javed.absar Subscribers: nhaehnle, kristof.beyls Differential Revision: https://reviews.llvm.org/D53319 llvm-svn: 345581
* [SelectionDAG] Add FoldBUILD_VECTOR to simplify new BUILD_VECTOR nodesSimon Pilgrim2018-10-302-16/+76
| | | | | | | | | | Similar to FoldCONCAT_VECTORS, this patch adds FoldBUILD_VECTOR to simplify cases that can avoid the creation of the BUILD_VECTOR - if all the operands are UNDEF or if the BUILD_VECTOR simplifies to a copy. This exposed an assumption in some AMDGPU code that getBuildVector was guaranteed to be a BUILD_VECTOR node that I've tried to handle. Differential Revision: https://reviews.llvm.org/D53760 llvm-svn: 345578
* [DAGCombiner] Improve X div/rem Y fold if single bit element typeDavid Bolvansky2018-10-301-3/+4
| | | | | | | | | | | | | | Summary: Tests by @spatel, thanks Reviewers: spatel, RKSimon Reviewed By: spatel Subscribers: sdardis, atanasyan, llvm-commits, spatel Differential Revision: https://reviews.llvm.org/D52668 llvm-svn: 345575
* [LegalizeTypes] Teach PromoteIntRes_BITCAST to better handle a bitcast with ↵Craig Topper2018-10-302-8/+22
| | | | | | | | | | | | | | | | vector output type and a vector input type that needs to be widened Summary: Previously if we had a bitcast vector output type that needs promotion and a vector input type that needs widening we would just do a stack store and load to handle the conversion. We can do a little better if we can widen the bitcast to a legal vector type the same size as the widened input type. Then we can do the bitcast between this widened type and the widened input type. Afterwards we can extract_subvector back to the original output and any_extend that. Type legalization will then circle back and handle promotion of the extract_subvector and the any_extend will just be removed. This will avoid going through the stack and allows us to remove a custom version of this legalization from X86. Reviewers: efriedma, RKSimon Reviewed By: efriedma Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D53229 llvm-svn: 345567
* [X86] Cleanup the code in LowerFABSorFNEG and LowerFCOPYSIGN a little. NFCCraig Topper2018-10-301-30/+20
| | | | | | Use SelectionDAG::EVTToAPFloatSemantics. Make the LogicVT calculation in LowerFABSorFNEG similar to LowerFCOPYSIGN. Use APInt::getSignedMaxValue instead of ~APInt::getSignMask. llvm-svn: 345565
* [X86] Stop changing f128 fand/for/fxor to v2i64.Craig Topper2018-10-302-21/+38
| | | | | | The additional patterns don't cost us much and it seems better than changing element widths. llvm-svn: 345564
* AMDGPU: Remove custom BUILD_VECTOR combineMatt Arsenault2018-10-302-46/+0
| | | | | | | This was looping in a testcase and removing it now slightly improves a test. llvm-svn: 345560
* AMDGPU: Use scavengeRegisterBackwardsMatt Arsenault2018-10-301-2/+3
| | | | llvm-svn: 345559
* Remove dead declarationMatt Arsenault2018-10-301-7/+0
| | | | llvm-svn: 345555
* Pass TRI to printRegMatt Arsenault2018-10-301-1/+1
| | | | llvm-svn: 345553
* Remove unneeded friend declarations that clang-cl warns onReid Kleckner2018-10-292-4/+0
| | | | llvm-svn: 345549
* [AliasSetTracker] Cleanup addPointer interface. [NFCI]Alina Sbirlea2018-10-291-6/+6
| | | | | | | | | | | | | | Summary: Attempting to simplify the addPointer interface. Currently there's code decomposing a MemoryLocation into (Ptr, Size, AAMDNodes) only to recreate the MemoryLocation inside the call. Reviewers: reames, mkazantsev Subscribers: sanjoy, jlebar, llvm-commits Differential Revision: https://reviews.llvm.org/D53836 llvm-svn: 345548
* [DWARF][NFC] Refactor range list extraction and dumpingWolfgang Pieb2018-10-297-229/+214
| | | | | | | | | | | | | | | | | The purpose of this patch is twofold: - Fold pre-DWARF v5 functionality into v5 to eliminate the need for 2 different versions of range list handling. We get rid of DWARFDebugRangelist{.cpp,.h}. - Templatize the handling of range list tables so that location list handling can take advantage of it as well. Location list and range list tables have the same basic layout. A non-NFC version of this patch was previously submitted with r342218, but it caused errors with some TSan tests. This patch has no functional changes. The difference to the non-NFC patch is that there are no changes to rangelist dumping in this patch. Differential Revision: https://reviews.llvm.org/D53545 llvm-svn: 345546
* Add parens to fix incorrect assert check.Erich Keane2018-10-291-1/+1
| | | | | | | | && has higher priority than ||, so this assert works really oddly. Add parens to match the programmer's intent. Change-Id: I3abe1361ee0694462190c5015779db664012f3d4 llvm-svn: 345543
* AMDGPU: Enable code object v3 by defaultKonstantin Zhuravlyov2018-10-291-15/+30
| | | | | | Differential Revision: https://reviews.llvm.org/D53525 llvm-svn: 345542
* [MachineOutliner] Inherit target features from parent functionJessica Paquette2018-10-291-0/+8
| | | | | | | | | | | | | | | If a function has target features, it may contain instructions that aren't represented in the default set of instructions. If the outliner pulls out one of these instructions, and the function doesn't have the right attributes attached, we'll run into an LLVM error explaining that the target doesn't support the necessary feature for the instruction. This makes outlined functions inherit target features from their parents. It also updates the machine-outliner.ll test to check that we're properly inheriting target features. llvm-svn: 345535
* [X86] Set isMachineVerifierClean() back to false (PR27481)Simon Pilgrim2018-10-291-0/+4
| | | | | | Put back the isMachineVerifierClean() override removed at rL345513 to fix Windows ThinLTO tests llvm-svn: 345528
* [HotColdSplitting] Allow outlining single-block cold regionsVedant Kumar2018-10-291-3/+20
| | | | | | | | | | | | | | | | It can be profitable to outline single-block cold regions because they may be large. Allow outlining single-block regions if they have over some threshold of non-debug, non-terminator instructions. I chose 3 as the threshold after experimenting with several internal frameworks. In practice, reducing the threshold further did not give much improvement, whereas increasing it resulted in substantial regressions. Differential Revision: https://reviews.llvm.org/D53824 llvm-svn: 345524
* [WebAssembly] Lower away condition truncations for scalar selectsThomas Lively2018-10-292-0/+14
| | | | | | | | | | Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D53676 llvm-svn: 345521
* [X86][SSE] getFauxShuffleMask - Fix shuffle mask adjustment for multiple ↵Simon Pilgrim2018-10-291-4/+3
| | | | | | | | inserted subvectors Part of the issue discovered in PR39483, although its not fully exposed until I reapply rL345395 (by reverting rL345451) llvm-svn: 345520
* [X86] Add AES to KNL CPUs to match clang.Craig Topper2018-10-291-0/+1
| | | | | | I believe this was lost from KNL when AES was pushed from Westmere to Skylake recently. KNL used to inherit from IVB. llvm-svn: 345519
* [AMDGPU] Fixed return value causing warning and regressionStanislav Mekhanoshin2018-10-291-1/+1
| | | | llvm-svn: 345518
* [AArch64] Rename FP16FML instruction format (NFC)Bryan Chan2018-10-292-72/+78
| | | | | | | Rename SIMDThreeSameMult (etc.) to SIMDThreeSameVectorFML (etc.) to follow usual naming convention, and add some comments in the .td files. llvm-svn: 345515
* [AMDGPU] Match v_swap_b32Stanislav Mekhanoshin2018-10-292-0/+175
| | | | | | Differential Revision: https://reviews.llvm.org/D52677 llvm-svn: 345514
* [X86] Enable the MachineVerifier by defaultFrancis Visoiu Mistrih2018-10-291-4/+0
| | | | | | | | | | | | | | | The machine verifier was disabled for x86 by default. There are now only 9 tests failing, compared to what previously was between 20 and 30. This is a good opportunity to file bugs for all the remaining issues, then explicitly disable the failing tests and enabling the machine verifier by default. This allows us to avoid adding new tests that break the verifier. PR27481 llvm-svn: 345513
* [Intrinsic] Signed and Unsigned Saturation Subtraction IntirnsicsLeonard Chan2018-10-2910-38/+101
| | | | | | | | | | | | Add an intrinsic that takes 2 integers and perform saturation subtraction on them. This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics. Differential Revision: https://reviews.llvm.org/D53783 llvm-svn: 345512
* [AArch64] Return address signing B key supportLuke Cheeseman2018-10-291-3/+20
| | | | | | | | | | | - Add support to generate AUTIBSP, PACIBSP, RETAB instructions for return address signing - The key used to sign the function is controlled by the function attribute "sign-return-address-key" Differential Revision: https://reviews.llvm.org/D51427 llvm-svn: 345511
* [LLVM-C] Add Builder Bindings to Common Memory IntrinsicsRobert Widmann2018-10-291-0/+24
| | | | | | | | | | | | | | Summary: Add IRBuilder bindings for memmove, memcpy, and memset. Reviewers: whitequark, deadalnix Reviewed By: whitequark Subscribers: harlanhaskins, llvm-commits Differential Revision: https://reviews.llvm.org/D53555 llvm-svn: 345508
* [X86] Force floating point values in constant pool decoding to print in ↵Craig Topper2018-10-291-1/+2
| | | | | | | | | | scientific notation so they can't be confused with integers. When the floating point constants are whole numbers they have no decimal point so look like integers, but mean something very different in something like an 'and' instruction. Ideally we would just print a decimal point and a 0, but I couldn't see how to make APFloat::toString do that. llvm-svn: 345488
* [X86] Recognize constant splats in LowerFCOPYSIGN.Craig Topper2018-10-281-1/+1
| | | | llvm-svn: 345484
* Revert "Revert "DebugInfo: reduce DIE range verification on object files""Saleem Abdulrasool2018-10-281-13/+45
| | | | | | | This reverts commit 836c763dadbd9478fa35b1a291a38bf17aa206ba. Default initialize the values that MSAN caught. llvm-svn: 345482
* [SelectionDAG] Fix bad indentation. NFCCraig Topper2018-10-281-4/+4
| | | | llvm-svn: 345481
* [TargetLowering] Move i64/vXi64 to f32/vXf32 UINT_TO_FP handling to ↵Simon Pilgrim2018-10-282-56/+72
| | | | | | TargetLowering::expandUINT_TO_FP. llvm-svn: 345478
* [VectorLegalizer] Enable TargetLowering::expandFP_TO_UINT support.Simon Pilgrim2018-10-283-11/+41
| | | | | | | | Add vector support to TargetLowering::expandFP_TO_UINT. This exposes an issue in X86TargetLowering::LowerVSELECT which was assuming that the select mask was the same width as the LHS/RHS ops - as long as the result is a sign splat we can easily sext/trunk this. llvm-svn: 345473
* [DAGCombiner] Better constant vector support for FCOPYSIGN.Craig Topper2018-10-281-4/+4
| | | | | | | | Enable constant folding when both operands are vectors of constants. Turn into FNEG/FABS when the RHS is a splat constant vector. llvm-svn: 345469
* Revert r344172: [LV] Add a new reduction pattern matchRenato Golin2018-10-271-65/+6
| | | | | | | | This patch has caused fast-math issues in the reduction pattern. Will re-work and land again. llvm-svn: 345465
* AMD BdVer2 (Piledriver) Initial Scheduler modelRoman Lebedev2018-10-273-2/+1293
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: # Overview This is somewhat partial. * Latencies are good {F7371125} * All of these remaining inconsistencies //appear// to be noise/noisy/flaky. * NumMicroOps are somewhat good {F7371158} * Most of the remaining inconsistencies are from `Ld` / `Ld_ReadAfterLd` classes * Actual unit occupation (pipes, `ResourceCycles`) are undiscovered lands, i did not really look there. They are basically verbatum copy from `btver2` * Many `InstRW`. And there are still inconsistencies left... To be noted: I think this is the first new schedule profile produced with the new next-gen tools like llvm-exegesis! # Benchmark I realize that isn't what was suggested, but i'll start with some "internal" public real-world benchmark i understand - [[ https://github.com/darktable-org/rawspeed | RawSpeed raw image decoding library ]]. Diff (the exact clang from trunk without/with this patch): ``` Comparing /home/lebedevri/rawspeed/build-old/src/utilities/rsbench/rsbench to /home/lebedevri/rawspeed/build-new/src/utilities/rsbench/rsbench Benchmark Time CPU Time Old Time New CPU Old CPU New ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Canon/EOS 5D Mark II/09.canon.sraw1.cr2/threads:8/real_time_pvalue 0.0000 0.0000 U Test, Repetitions: 25 vs 25 Canon/EOS 5D Mark II/09.canon.sraw1.cr2/threads:8/real_time_mean -0.0607 -0.0604 234 219 233 219 Canon/EOS 5D Mark II/09.canon.sraw1.cr2/threads:8/real_time_median -0.0630 -0.0626 233 219 233 219 Canon/EOS 5D Mark II/09.canon.sraw1.cr2/threads:8/real_time_stddev +0.2581 +0.2587 1 2 1 2 Canon/EOS 5D Mark II/10.canon.sraw2.cr2/threads:8/real_time_pvalue 0.0000 0.0000 U Test, Repetitions: 25 vs 25 Canon/EOS 5D Mark II/10.canon.sraw2.cr2/threads:8/real_time_mean -0.0770 -0.0767 144 133 144 133 Canon/EOS 5D Mark II/10.canon.sraw2.cr2/threads:8/real_time_median -0.0767 -0.0763 144 133 144 133 Canon/EOS 5D Mark II/10.canon.sraw2.cr2/threads:8/real_time_stddev -0.4170 -0.4156 1 0 1 0 Canon/EOS 5DS/2K4A9927.CR2/threads:8/real_time_pvalue 0.0000 0.0000 U Test, Repetitions: 25 vs 25 Canon/EOS 5DS/2K4A9927.CR2/threads:8/real_time_mean -0.0271 -0.0270 463 450 463 450 Canon/EOS 5DS/2K4A9927.CR2/threads:8/real_time_median -0.0093 -0.0093 453 449 453 449 Canon/EOS 5DS/2K4A9927.CR2/threads:8/real_time_stddev -0.7280 -0.7280 13 4 13 4 Canon/EOS 5DS/2K4A9928.CR2/threads:8/real_time_pvalue 0.0004 0.0004 U Test, Repetitions: 25 vs 25 Canon/EOS 5DS/2K4A9928.CR2/threads:8/real_time_mean -0.0065 -0.0065 569 565 569 565 Canon/EOS 5DS/2K4A9928.CR2/threads:8/real_time_median -0.0077 -0.0077 569 564 569 564 Canon/EOS 5DS/2K4A9928.CR2/threads:8/real_time_stddev +1.0077 +1.0068 2 5 2 5 Canon/EOS 5DS/2K4A9929.CR2/threads:8/real_time_pvalue 0.0220 0.0199 U Test, Repetitions: 25 vs 25 Canon/EOS 5DS/2K4A9929.CR2/threads:8/real_time_mean +0.0006 +0.0007 312 312 312 312 Canon/EOS 5DS/2K4A9929.CR2/threads:8/real_time_median +0.0031 +0.0032 311 312 311 312 Canon/EOS 5DS/2K4A9929.CR2/threads:8/real_time_stddev -0.7069 -0.7072 4 1 4 1 Canon/EOS 10D/CRW_7673.CRW/threads:8/real_time_pvalue 0.0004 0.0004 U Test, Repetitions: 25 vs 25 Canon/EOS 10D/CRW_7673.CRW/threads:8/real_time_mean -0.0015 -0.0015 141 141 141 141 Canon/EOS 10D/CRW_7673.CRW/threads:8/real_time_median -0.0010 -0.0011 141 141 141 141 Canon/EOS 10D/CRW_7673.CRW/threads:8/real_time_stddev -0.1486 -0.1456 0 0 0 0 Canon/EOS 40D/_MG_0154.CR2/threads:8/real_time_pvalue 0.6139 0.8766 U Test, Repetitions: 25 vs 25 Canon/EOS 40D/_MG_0154.CR2/threads:8/real_time_mean -0.0008 -0.0005 60 60 60 60 Canon/EOS 40D/_MG_0154.CR2/threads:8/real_time_median -0.0006 -0.0002 60 60 60 60 Canon/EOS 40D/_MG_0154.CR2/threads:8/real_time_stddev -0.1467 -0.1390 0 0 0 0 Canon/EOS 77D/IMG_4049.CR2/threads:8/real_time_pvalue 0.0137 0.0137 U Test, Repetitions: 25 vs 25 Canon/EOS 77D/IMG_4049.CR2/threads:8/real_time_mean +0.0002 +0.0002 275 275 275 275 Canon/EOS 77D/IMG_4049.CR2/threads:8/real_time_median -0.0015 -0.0014 275 275 275 275 Canon/EOS 77D/IMG_4049.CR2/threads:8/real_time_stddev +3.3687 +3.3587 0 2 0 2 Canon/PowerShot G1/crw_1693.crw/threads:8/real_time_pvalue 0.4041 0.3933 U Test, Repetitions: 25 vs 25 Canon/PowerShot G1/crw_1693.crw/threads:8/real_time_mean +0.0004 +0.0004 67 67 67 67 Canon/PowerShot G1/crw_1693.crw/threads:8/real_time_median -0.0000 -0.0000 67 67 67 67 Canon/PowerShot G1/crw_1693.crw/threads:8/real_time_stddev +0.1947 +0.1995 0 0 0 0 Fujifilm/GFX 50S/20170525_0037TEST.RAF/threads:8/real_time_pvalue 0.0074 0.0001 U Test, Repetitions: 25 vs 25 Fujifilm/GFX 50S/20170525_0037TEST.RAF/threads:8/real_time_mean -0.0092 +0.0074 547 542 25 25 Fujifilm/GFX 50S/20170525_0037TEST.RAF/threads:8/real_time_median -0.0054 +0.0115 544 541 25 25 Fujifilm/GFX 50S/20170525_0037TEST.RAF/threads:8/real_time_stddev -0.4086 -0.3486 8 5 0 0 Fujifilm/X-Pro2/_DSF3051.RAF/threads:8/real_time_pvalue 0.3320 0.0000 U Test, Repetitions: 25 vs 25 Fujifilm/X-Pro2/_DSF3051.RAF/threads:8/real_time_mean +0.0015 +0.0204 218 218 12 12 Fujifilm/X-Pro2/_DSF3051.RAF/threads:8/real_time_median +0.0001 +0.0203 218 218 12 12 Fujifilm/X-Pro2/_DSF3051.RAF/threads:8/real_time_stddev +0.2259 +0.2023 1 1 0 0 GoPro/HERO6 Black/GOPR9172.GPR/threads:8/real_time_pvalue 0.0000 0.0001 U Test, Repetitions: 25 vs 25 GoPro/HERO6 Black/GOPR9172.GPR/threads:8/real_time_mean -0.0209 -0.0179 96 94 90 88 GoPro/HERO6 Black/GOPR9172.GPR/threads:8/real_time_median -0.0182 -0.0155 95 93 90 88 GoPro/HERO6 Black/GOPR9172.GPR/threads:8/real_time_stddev -0.6164 -0.2703 2 1 2 1 Kodak/DCS Pro 14nx/D7465857.DCR/threads:8/real_time_pvalue 0.0000 0.0000 U Test, Repetitions: 25 vs 25 Kodak/DCS Pro 14nx/D7465857.DCR/threads:8/real_time_mean -0.0098 -0.0098 176 175 176 175 Kodak/DCS Pro 14nx/D7465857.DCR/threads:8/real_time_median -0.0126 -0.0126 176 174 176 174 Kodak/DCS Pro 14nx/D7465857.DCR/threads:8/real_time_stddev +6.9789 +6.9157 0 2 0 2 Nikon/D850/Nikon-D850-14bit-lossless-compressed.NEF/threads:8/real_time_pvalue 0.0000 0.0000 U Test, Repetitions: 25 vs 25 Nikon/D850/Nikon-D850-14bit-lossless-compressed.NEF/threads:8/real_time_mean -0.0237 -0.0238 474 463 474 463 Nikon/D850/Nikon-D850-14bit-lossless-compressed.NEF/threads:8/real_time_median -0.0267 -0.0267 473 461 473 461 Nikon/D850/Nikon-D850-14bit-lossless-compressed.NEF/threads:8/real_time_stddev +0.7179 +0.7178 3 5 3 5 Olympus/E-M1MarkII/Olympus_EM1mk2__HIRES_50MP.ORF/threads:8/real_time_pvalue 0.6837 0.6554 U Test, Repetitions: 25 vs 25 Olympus/E-M1MarkII/Olympus_EM1mk2__HIRES_50MP.ORF/threads:8/real_time_mean -0.0014 -0.0013 1375 1373 1375 1373 Olympus/E-M1MarkII/Olympus_EM1mk2__HIRES_50MP.ORF/threads:8/real_time_median +0.0018 +0.0019 1371 1374 1371 1374 Olympus/E-M1MarkII/Olympus_EM1mk2__HIRES_50MP.ORF/threads:8/real_time_stddev -0.7457 -0.7382 11 3 10 3 Panasonic/DC-G9/P1000476.RW2/threads:8/real_time_pvalue 0.0000 0.0000 U Test, Repetitions: 25 vs 25 Panasonic/DC-G9/P1000476.RW2/threads:8/real_time_mean -0.0080 -0.0289 22 22 10 10 Panasonic/DC-G9/P1000476.RW2/threads:8/real_time_median -0.0070 -0.0287 22 22 10 10 Panasonic/DC-G9/P1000476.RW2/threads:8/real_time_stddev +1.0977 +0.6614 0 0 0 0 Panasonic/DC-GH5/_T012014.RW2/threads:8/real_time_pvalue 0.0000 0.0000 U Test, Repetitions: 25 vs 25 Panasonic/DC-GH5/_T012014.RW2/threads:8/real_time_mean +0.0132 +0.0967 35 36 10 11 Panasonic/DC-GH5/_T012014.RW2/threads:8/real_time_median +0.0132 +0.0956 35 36 10 11 Panasonic/DC-GH5/_T012014.RW2/threads:8/real_time_stddev -0.0407 -0.1695 0 0 0 0 Panasonic/DC-GH5S/P1022085.RW2/threads:8/real_time_pvalue 0.0000 0.0000 U Test, Repetitions: 25 vs 25 Panasonic/DC-GH5S/P1022085.RW2/threads:8/real_time_mean +0.0331 +0.1307 13 13 6 6 Panasonic/DC-GH5S/P1022085.RW2/threads:8/real_time_median +0.0430 +0.1373 12 13 6 6 Panasonic/DC-GH5S/P1022085.RW2/threads:8/real_time_stddev -0.9006 -0.8847 1 0 0 0 Pentax/645Z/IMGP2837.PEF/threads:8/real_time_pvalue 0.0016 0.0010 U Test, Repetitions: 25 vs 25 Pentax/645Z/IMGP2837.PEF/threads:8/real_time_mean -0.0023 -0.0024 395 394 395 394 Pentax/645Z/IMGP2837.PEF/threads:8/real_time_median -0.0029 -0.0030 395 394 395 393 Pentax/645Z/IMGP2837.PEF/threads:8/real_time_stddev -0.0275 -0.0375 1 1 1 1 Phase One/P65/CF027310.IIQ/threads:8/real_time_pvalue 0.0232 0.0000 U Test, Repetitions: 25 vs 25 Phase One/P65/CF027310.IIQ/threads:8/real_time_mean -0.0047 +0.0039 114 113 28 28 Phase One/P65/CF027310.IIQ/threads:8/real_time_median -0.0050 +0.0037 114 113 28 28 Phase One/P65/CF027310.IIQ/threads:8/real_time_stddev -0.0599 -0.2683 1 1 0 0 Samsung/NX1/2016-07-23-142101_sam_9364.srw/threads:8/real_time_pvalue 0.0000 0.0000 U Test, Repetitions: 25 vs 25 Samsung/NX1/2016-07-23-142101_sam_9364.srw/threads:8/real_time_mean +0.0206 +0.0207 405 414 405 414 Samsung/NX1/2016-07-23-142101_sam_9364.srw/threads:8/real_time_median +0.0204 +0.0205 405 414 405 414 Samsung/NX1/2016-07-23-142101_sam_9364.srw/threads:8/real_time_stddev +0.2155 +0.2212 1 1 1 1 Samsung/NX30/2015-03-07-163604_sam_7204.srw/threads:8/real_time_pvalue 0.0000 0.0000 U Test, Repetitions: 25 vs 25 Samsung/NX30/2015-03-07-163604_sam_7204.srw/threads:8/real_time_mean -0.0109 -0.0108 147 145 147 145 Samsung/NX30/2015-03-07-163604_sam_7204.srw/threads:8/real_time_median -0.0104 -0.0103 147 145 147 145 Samsung/NX30/2015-03-07-163604_sam_7204.srw/threads:8/real_time_stddev -0.4919 -0.4800 0 0 0 0 Samsung/NX3000/_3184416.SRW/threads:8/real_time_pvalue 0.0000 0.0000 U Test, Repetitions: 25 vs 25 Samsung/NX3000/_3184416.SRW/threads:8/real_time_mean -0.0149 -0.0147 220 217 220 217 Samsung/NX3000/_3184416.SRW/threads:8/real_time_median -0.0173 -0.0169 221 217 220 217 Samsung/NX3000/_3184416.SRW/threads:8/real_time_stddev +1.0337 +1.0341 1 3 1 3 Sony/DSLR-A350/DSC05472.ARW/threads:8/real_time_pvalue 0.0001 0.0001 U Test, Repetitions: 25 vs 25 Sony/DSLR-A350/DSC05472.ARW/threads:8/real_time_mean -0.0019 -0.0019 194 193 194 193 Sony/DSLR-A350/DSC05472.ARW/threads:8/real_time_median -0.0021 -0.0021 194 193 194 193 Sony/DSLR-A350/DSC05472.ARW/threads:8/real_time_stddev -0.4441 -0.4282 0 0 0 0 Sony/ILCE-7RM2/14-bit-compressed.ARW/threads:8/real_time_pvalue 0.0000 0.4263 U Test, Repetitions: 25 vs 25 Sony/ILCE-7RM2/14-bit-compressed.ARW/threads:8/real_time_mean +0.0258 -0.0006 81 83 19 19 Sony/ILCE-7RM2/14-bit-compressed.ARW/threads:8/real_time_median +0.0235 -0.0011 81 82 19 19 Sony/ILCE-7RM2/14-bit-compressed.ARW/threads:8/real_time_stddev +0.1634 +0.1070 1 1 0 0 ``` {F7443905} If we look at the `_mean`s, the time column, the biggest win is `-7.7%` (`Canon/EOS 5D Mark II/10.canon.sraw2.cr2`), and the biggest loose is `+3.3%` (`Panasonic/DC-GH5S/P1022085.RW2`); Overall: mean `-0.7436%`, median `-0.23%`, `cbrt(sum(time^3))` = `-8.73%` Looks good so far i'd say. llvm-exegesis details: {F7371117} {F7371125} {F7371128} {F7371144} {F7371158} Reviewers: craig.topper, RKSimon, andreadb, courbet, avt77, spatel, GGanesh Reviewed By: andreadb Subscribers: javed.absar, gbedwell, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D52779 llvm-svn: 345463
* [X86][SSE] LowerVSELECT - pull out repeated getOperand(). NFCI.Simon Pilgrim2018-10-271-11/+13
| | | | llvm-svn: 345458
* Revert "DebugInfo: reduce DIE range verification on object files"Vlad Tsyrklevich2018-10-271-44/+13
| | | | | | | This reverts commits r345441 and r345444, they were causing msan buildbot failures. llvm-svn: 345457
* [Local] Keep K's range if K does not move when combining metadata.Florian Hahn2018-10-271-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As K has to dominate I, IIUC I's range metadata must be a subset of K's. After Eli's recent clarification to the LangRef, loading a value outside of the range is undefined behavior. Therefore if I's range contains elements outside of K's range and we would load one such value, K would cause undefined behavior. In cases like hoisting/sinking, we still want the most generic range over all code paths to/from the hoist/sink point. As suggested in the patches related to D47339, I will refactor the handling of those scenarios and try to decouple it from this function as follow up, once we switched to a similar handling of metadata in most of combineMetadata. I updated some tests checking mostly the merging of metadata to keep the metadata of to dominating load. The most interesting one is probably test8 in test/Transforms/JumpThreading/thread-loads.ll. It contained a comment about the alias metadata preventing us to eliminate the branch, but it seem like the actual problem currently is that we merge the ranges of both loads and cannot eliminate the icmp afterwards. With this patch, we manage to eliminate the icmp, as the range of the first load excludes 8. Reviewers: efriedma, nlopes, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D51629 llvm-svn: 345456
* Fix -Wdocumentation warning. NFCI.Simon Pilgrim2018-10-271-4/+4
| | | | llvm-svn: 345454
* [TargetLowering] Move LegalizeDAG FP_TO_UINT handling to ↵Simon Pilgrim2018-10-272-22/+34
| | | | | | | | TargetLowering::expandFP_TO_UINT. NFCI. First step towards fixing PR17686 and adding vector support. llvm-svn: 345452
* Revert rL345395: [X86][SSE] Move 2-input limit up from getFauxShuffleMask to ↵Simon Pilgrim2018-10-271-2/+6
| | | | | | | | | | resolveTargetShuffleInputs Makes no difference to actual shuffle decoding yet, but merges all the existing limits in one place for when proper support is fixed. ........ Its been reported that this is causing out of trunk failures. llvm-svn: 345451
* [ARM64][Windows] MCLayer support for exception handlingSanjin Sijaric2018-10-2713-43/+748
| | | | | | | | | | Add ARM64 unwind codes to MCLayer, as well SEH directives that will be emitted by the frame lowering patch to follow. We only emit unwind codes into object object files for now. Differential Revision: https://reviews.llvm.org/D50166 llvm-svn: 345450
OpenPOWER on IntegriCloud