Commit message, Author, Age, Files, Lines
...
* [X86] Don't mark v64i8/v32i16 ISD::SELECT as custom unless they are legal types.Craig Topper2019-06-212-7/+108
| | | | | | | | | We don't have any Custom handling during type legalization. Only operation legalization. Fixes PR42355 llvm-svn: 364093
* [X86] Add avx512bw command lines to avx512-select.llCraig Topper2019-06-211-89/+192
| | | | | | | Prep for fixing PR42355 and ensuring we have coverage of ISD::SELECT for v64i8/v32i16 on KNL and SKX configs. llvm-svn: 364092
* [X86] Add a debug print of the node in the default case for unhandled opcodes in ReplaceNodeResults.Craig Topper2019-06-211-0/+4
| | | | | | | | | | This should be unreachable, but bugs can make it reachable. This adds a debug print so we can see the bad node in the output when the llvm_unreachable triggers. llvm-svn: 364091
* [X86][AVX] Combine INSERT_SUBVECTOR(SRC0, EXTRACT_SUBVECTOR(SRC1)) as shuffleSimon Pilgrim2019-06-216-86/+148
| | | | | | Subvector shuffling often ends up as insert/extract subvector. llvm-svn: 364090
* Revert [test][Driver] Fix Clang :: Driver/cl-response-file.cReid Kleckner2019-06-211-1/+1
| | | | | | | | | This reverts r363985 (git commit d5f16d6cfccc4b0b13b6c01d16c673886d53e695) This test can't use printf on Windows because the path contains backslashes which must not be interpreted as escapes by printf. llvm-svn: 364089
* [clang-scan-deps] print the dependencies to stdoutAlex Lorenz2019-06-215-21/+88
| | | | | | | | and remove the need to use -MD options in the CDB Differential Revision: https://reviews.llvm.org/D63579 llvm-svn: 364088
* Quote path to Python executable in case it has spacesReid Kleckner2019-06-211-3/+13
| | | | | | | | | These days Python 3 is typically installed into C:/Program Files, so cope with that. Similar to r364077 in compiler-rt. llvm-svn: 364087
* [AArch64][GlobalISel] Implement selection support for the new G_JUMP_TABLE and G_BRJT ops.Amara Emerson2019-06-214-1/+178
| | | | | | | | | | With this we can now fully code generate jump tables, which is important for code size. Differential Revision: https://reviews.llvm.org/D63223 llvm-svn: 364086
* [GlobalISel][IRTranslator] Change switch table translation to generate jump tables and range checks.Amara Emerson2019-06-216-181/+816
| | | | This change makes use of the newly refactored SwitchLoweringUtils code from SelectionDAG in order to generate jump tables and range checks where appropriate. Much of this code is ported from SDAG with some modifications. We generate G_JUMP_TABLE and G_BRJT instructions when JT opportunities are found. This means that targets which previously relied on the naive one MBB per case stmt translation will now start falling back until they add support for the new opcodes. For range checks, we don't generate any previously unused operations. This just recognizes contiguous ranges of case values and generates a single block per range. Single case value blocks are just a special case of ranges so we get that support almost for free. There are still some optimizations missing that I haven't ported over, and bit-tests are also unimplemented. This patch series is already complex enough. Actual arm64 support for selection of jump tables is coming in a later patch. Differential Revision: https://reviews.llvm.org/D63169 llvm-svn: 364085
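The range-check idea described in the commit above (recognize contiguous ranges of case values and emit one block per range) can be illustrated with a small sketch. This is not the IRTranslator or SwitchLoweringUtils code; `cluster_case_values` is a hypothetical helper, and the sketch assumes every case value in the input branches to the same destination, so only the values themselves decide the grouping.

```python
def cluster_case_values(values):
    """Group integer case values into contiguous (low, high) ranges.

    Illustrative assumption: all values share one destination block,
    so adjacency of the values is the only merge criterion.
    """
    ranges = []
    for value in sorted(set(values)):
        if ranges and value == ranges[-1][1] + 1:
            # Extends the current range by one: grow it in place.
            ranges[-1][1] = value
        else:
            # Starts a new range; a single value is the range (v, v).
            ranges.append([value, value])
    return [tuple(r) for r in ranges]
```

A single case value comes out as a range whose low and high bounds are equal, which mirrors the remark that single-value blocks are just a special case of ranges.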
* [SLP] Look-ahead operand reordering heuristic.Simon Pilgrim2019-06-212-93/+276
| | | | | | | | | | This patch introduces a new heuristic for guiding operand reordering. The new "look-ahead" heuristic can look beyond the immediate predecessors. This helps break ties when the immediate predecessors have identical opcodes (see lit test for an example). Committed on behalf of @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D60897 llvm-svn: 364084
* [NFC] Update shl-sub testsDavid Bolvansky2019-06-211-15/+14
| | | | llvm-svn: 364083
* [InstCombine] add tests for ctpop folds; NFCSanjay Patel2019-06-211-0/+117
| | | | llvm-svn: 364082
* Fix ARM buildbot.Richard Smith2019-06-211-4/+4
| | | | llvm-svn: 364081
* [OPENMP]Fix PR42068: Vla type is not captured.Alexey Bataev2019-06-212-9/+40
| | | | | | | | If the variably modified type is declared outside of the captured region and then used in a cast expression along with an array subscript expression, the type is not captured, which leads to a compiler crash. llvm-svn: 364080
* [X86] Use vmovq for v4i64/v4f64/v8i64/v8f64 vzmovl.Craig Topper2019-06-217-90/+60
| | | | | | | | | | | | | | We already use vmovq for v2i64/v2f64 vzmovl. But we were using a blendpd+xorpd for v4i64/v4f64/v8i64/v8f64 under opt speed. Or movsd+xorpd under optsize. I think the blend with 0 or movss/d is only needed for vXi32 where we don't have an instruction that can move 32 bits from one xmm to another while zeroing upper bits. movq is no worse than blendpd on any known CPUs. llvm-svn: 364079
* Ensure that top-level QualType objects also have a "kind" field when dumping the AST to JSON.Aaron Ballman2019-06-212-0/+73
| | | | | | llvm-svn: 364078
* [asan] Quote the path to the Python exe in case it has spacesReid Kleckner2019-06-211-2/+10
| | | | | | | | | | | | | These days, Python 3 installs itself into Program Files, so it often has spaces. At first, I resisted this, and I reinstalled it globally into C:/Python37, similar to the location used for Python 2.7. But then I updated VS 2019, and it uninstalled my copy of Python and installed a new one inside "C:/Program Files (x86)/Microsoft Visual Studio/". At this point, I gave up and switched to using its built-in version of Python. However, now these tests fail, and have to be made aware of the possibility of spaces in paths. :( llvm-svn: 364077
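The failure mode described above (an interpreter path containing spaces being split into several arguments) can be sketched as follows. This is only an illustration using Python's standard `shlex` module, not the actual change made to the test configuration; `build_command` is a hypothetical helper, and the quoting shown follows POSIX shell rules.

```python
import shlex


def build_command(python_exe, script, *args):
    """Join an interpreter path, a script, and arguments into one command line.

    Each part is quoted individually so that a path such as
    "C:/Program Files/..." survives as a single argument.
    """
    parts = (python_exe, script) + args
    return " ".join(shlex.quote(part) for part in parts)
```

Splitting the resulting string again with `shlex.split` yields the original three-element argument list, whereas a naive unquoted join would break the interpreter path at the space.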
* [DAGCombine] narrowExtractedVectorBinOp - pull out repeated getOpcode(). NFCI.Simon Pilgrim2019-06-211-2/+2
| | | | llvm-svn: 364076
* [AArch64][GlobalISel] Make s8 and s16 G_CONSTANTs legal.Amara Emerson2019-06-2116-158/+192
| | | | | | | | | | | | | | | | | | | | | We sometimes get poor code size because constants of types < 32b are legalized as 32 bit G_CONSTANTs with a truncate to fit. This works but means that the localizer can no longer sink them (although it's possible to extend it to do so). On AArch64 however s8 and s16 constants can be selected in the same way as s32 constants, with a mov pseudo into a W register. If we make s8 and s16 constants legal then we can avoid unnecessary truncates, they can be CSE'd, and the localizer can sink them as normal. There is a caveat: if the user of a smaller constant has to widen the sources, we end up with an anyext of the smaller typed G_CONSTANT. This can cause regressions because of the additional extend and missed pattern matching. To remedy this, there's a new artifact combiner to generate the wider G_CONSTANT if it's legal for the target. Differential Revision: https://reviews.llvm.org/D63587 llvm-svn: 364075
* [AMDGPU] hazard recognizer for fp atomic to s_denorm_modeStanislav Mekhanoshin2019-06-2110-28/+559
| | | | | | | | | This requires 3 wait states unless there is a wait or VALU in between. Differential Revision: https://reviews.llvm.org/D63619 llvm-svn: 364074
* [InstCombine] (1 << (C - x)) -> ((1 << C) >> x) if C is bitwidth - 1David Bolvansky2019-06-212-9/+13
| | | | Summary:
| | | | ```
| | | | %a = sub i32 31, %x
| | | | %r = shl i32 1, %a
| | | | =>
| | | | %d = shl i32 1, 31
| | | | %r = lshr i32 %d, %x
| | | |
| | | | Done: 1
| | | | Optimization is correct!
| | | | ```
| | | | https://rise4fun.com/Alive/btZm
| | | | Reviewers: spatel, lebedev.ri, nikic
| | | | Reviewed By: lebedev.ri
| | | | Subscribers: llvm-commits
| | | | Tags: #llvm
| | | | Differential Revision: https://reviews.llvm.org/D63652
| | | | llvm-svn: 364073
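The fold above can be sanity-checked exhaustively for in-range shift amounts. The sketch below models i32 arithmetic with Python integers masked to 32 bits; the function names are made up for illustration, and only x in 0..31 is covered, since larger values make the original shift amount out of range.

```python
BITS = 32
MASK = (1 << BITS) - 1


def shl_of_sub(x):
    # Models: %a = sub i32 31, %x ; %r = shl i32 1, %a
    return (1 << (BITS - 1 - x)) & MASK


def lshr_of_shl(x):
    # Models: %d = shl i32 1, 31 ; %r = lshr i32 %d, %x
    return ((1 << (BITS - 1)) & MASK) >> x


def fold_holds_for_valid_shifts():
    # Compare both forms for every in-range shift amount.
    return all(shl_of_sub(x) == lshr_of_shl(x) for x in range(BITS))
```

This is a brute-force check rather than a proof; the Alive link in the commit message is the authoritative verification.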
* [X86] isBinOp - move commutative ops to isCommutativeBinOp. NFCI.Simon Pilgrim2019-06-211-6/+6
| | | | | | TargetLoweringBase::isBinOp checks isCommutativeBinOp as a fallback, so don't duplicate. llvm-svn: 364072
* [OpenCL][PR41963] Add generic addr space to old atomics in C++ modeAnastasia Stulova2019-06-212-0/+53
| | | | | | | | | Add overloads with generic address space pointer to old atomics. This is currently only added for C++ compilation mode. Differential Revision: https://reviews.llvm.org/D62335 llvm-svn: 364071
* [asan] Avoid two compiler-synthesized calls to memset & memcpyReid Kleckner2019-06-212-5/+5
| | | | | | | | | | Otherwise the tests hang on Windows attempting to report nested errors. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D63627 llvm-svn: 364070
* [NFC] Added more tests for D63652David Bolvansky2019-06-211-0/+33
| | | | llvm-svn: 364069
* Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. NFCI.Simon Pilgrim2019-06-211-1/+1
| | | | llvm-svn: 364068
* Print more type node information when dumping the AST to JSON.Aaron Ballman2019-06-213-0/+381
| | | | llvm-svn: 364067
* [clang][NewPM] Add -fno-experimental-new-pass-manager to testsLeonard Chan2019-06-2113-91/+135
| | | | As per the discussion on D58375, we disable tests that have optimizations under the new PM. This patch adds -fno-experimental-new-pass-manager to RUNS that:
| | | | - Already run with optimizations (-O1 or higher) that were missed in D58375.
| | | | - Explicitly test new PM behavior alongside some new PM RUNS, but are missing this flag if new PM is enabled by default.
| | | | - Specify -O without the number. Based on getOptimizationLevel(), it seems the default is 2, and the IR appears to be the same when changed to -O2, so update the test to explicitly say -O2 and provide -fno-experimental-new-pass-manager.
| | | | Differential Revision: https://reviews.llvm.org/D63156 llvm-svn: 364066
* Use rvalue references throughout the is_constructible traits.Eric Fiselier2019-06-214-153/+6
| | | | llvm-svn: 364065
* [InstCombine] cttz(abs(x)) -> cttz(x)David Bolvansky2019-06-212-36/+25
| | | | | | | | | | | | Summary: Signedness does not change number of trailing zeros. Reviewers: spatel, lebedev.ri, nikic Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D63546 llvm-svn: 364064
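The claim that signedness does not change the number of trailing zeros can be checked by brute force on a small bit width. The sketch below uses 8-bit two's complement values purely to keep the search tiny; `cttz` and `abs_wrapped` are illustrative helpers, with `abs_wrapped` mapping the minimum value to itself, as wrapping negation does.

```python
BITS = 8
MASK = (1 << BITS) - 1


def cttz(v):
    """Count trailing zeros of a BITS-wide value; returns BITS for zero."""
    v &= MASK
    if v == 0:
        return BITS
    # v & -v isolates the lowest set bit; its index is the zero count.
    return (v & -v).bit_length() - 1


def abs_wrapped(v):
    """Two's complement absolute value, wrapping at the minimum value."""
    v &= MASK
    signed = v - (1 << BITS) if v & (1 << (BITS - 1)) else v
    return abs(signed) & MASK


def fold_holds_for_all_values():
    return all(cttz(abs_wrapped(v)) == cttz(v) for v in range(1 << BITS))
```

The check passes because negating a two's complement value leaves its lowest set bit in place, so the run of zeros below that bit is unchanged.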
* Make move and forward work in C++03.Eric Fiselier2019-06-2112-195/+43
| | | | | | | | | | | | | | | | | | | | | These functions are key to allowing the use of rvalues and variadics in C++03 mode. Everything works the same as in C++11, except for one tangentially related case: struct T { T(T &&) = default; }; In C++11, T has a deleted copy constructor. But in C++03 Clang gives it both a move and a copy constructor. This seems reasonable enough given the extensions it's using. The other changes in this patch were the minimal set required to keep the tests passing after the move/forward change. Most notably the removal of the `__rv<unique_ptr>` hack that was present in an attempt to make unique_ptr move only without language support. llvm-svn: 364063
* [GVNSink] prevent crashing on mismatched instructions (PR42346)Sanjay Patel2019-06-212-0/+43
| | | | | | | Patch based on suggestion by James Molloy (@jmolloy) in: https://bugs.llvm.org/show_bug.cgi?id=42346 llvm-svn: 364062
* [OPENMP]Fix PR42159: do not capture threadprivate variables.Alexey Bataev2019-06-212-4/+8
| | | | | | | Threadprivate variables should not be captured in outlined regions; otherwise the compiler crashes. llvm-svn: 364061
* [NFC] Added tests for (1 << (C - x)) -> ((1 << C) >> x)David Bolvansky2019-06-211-0/+152
| | | | llvm-svn: 364060
* [DAGCombine] narrowInsertExtractVectorBinOp - reuse "extract from insert" detection code.Simon Pilgrim2019-06-211-11/+15
| | | | | | | | Move the "extract from insert detection code" into a lambda helper function. llvm-svn: 364059
* Enable aligned_union in C++03Eric Fiselier2019-06-212-5/+0
| | | | llvm-svn: 364058
* Get is_convertible tests passing in C++03 (except the fallback).Eric Fiselier2019-06-212-8/+3
| | | | llvm-svn: 364057
* [docs][llvm-objdump] Fix bad merge of docsJames Henderson2019-06-211-4/+4
| | | | llvm-svn: 364056
* Add an automated note to files produced by gen_ast_dump_json_test.py.Aaron Ballman2019-06-2121-627/+678
| | | | | | This also details what filters, if any, were used to generate the test output. Updates all the current JSON testing files to include the automated note. llvm-svn: 364055
* Remove dead non-variadic workarounds in <type_traits>Eric Fiselier2019-06-211-414/+3
| | | | | | We can use variadics with clang llvm-svn: 364054
* Make rvalue metaprogramming traits work in C++03.Eric Fiselier2019-06-2110-43/+0
| | | | | | The next step is to get move and forward working in C++03. llvm-svn: 364053
* [llvm-objcopy] - Get rid of dynrel.elf precompiled binary from inputs.George Rimar2019-06-213-20/+64
| | | | | | | | | | We should not spread the use of precompiled binaries in tests when we can use YAML. This patch removes the dynrel.elf binary and adds a few comments to the test cases. Differential revision: https://reviews.llvm.org/D63641 llvm-svn: 364052
* [Scalarizer] Propagate IR flagsJay Foad2019-06-212-4/+59
| | | | | | | | | | | | | | | | | | Summary: The motivation for this was to propagate fast-math flags like nnan and ninf on vector floating point operations to the corresponding scalar operations to take advantage of follow-on optimizations. But I think the same argument applies to all of our IR flags: if they apply to the vector operation then they also apply to all the individual scalar operations, and they might enable follow-on optimizations. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63593 llvm-svn: 364051
* Remove even more dead code.Eric Fiselier2019-06-212-240/+2
| | | | llvm-svn: 364050
* [llvm-readobj] - Inline a few yaml inputs into test cases.George Rimar2019-06-216-294/+293
| | | | | | | | | There are some tests that are split into a main part + input yaml for no visible reason. This patch inlines the yaml part for the 3 test cases I found. Differential revision: https://reviews.llvm.org/D63644 llvm-svn: 364049
* Set an explicit x86 triple for test bottleneck-analysis.s added by my r364045. NFCAndrea Di Biagio2019-06-211-1/+1
| | | | | | | | This should unbreak the ppc64 buildbots. llvm-svn: 364048
* Assume __is_final, __is_base_of, and friends.Eric Fiselier2019-06-213-96/+4
| | | | | | | | | | All the compilers we support provide these builtins. We don't need to do a configuration dance anymore. This patch also cleans up some dead or almost dead C++11 feature detection macros. llvm-svn: 364047
* [RISCV] Add RISCV-specific TargetTransformInfoSam Elliott2019-06-219-13/+196
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: LLVM Allows Targets to provide information that guides optimisations made to LLVM IR. This is done with callbacks on a TargetTransformInfo object. This patch adds a TargetTransformInfo class for RISC-V. This will allow us to implement RISC-V specific callbacks as they become necessary. This commit also adds the getIntImmCost callbacks, and tests them with a simple constant hoisting test. Our immediate costs are on the conservative side, for the moment, but we prevent hoisting in most circumstances anyway. Previous review was on D63007 Reviewers: asb, luismarques Reviewed By: asb Subscribers: ributzka, MaskRay, llvm-commits, Jim, benna, psnobl, jocewei, PkmX, rkruppe, the_o, brucehoult, MartinMosbeck, rogfer01, edward-jones, zzheng, jrtc27, shiva0217, kito-cheng, niosHD, sabuasal, apazos, simoncook, johnrusso, rbar, hiraditya, mgorny Tags: #llvm Differential Revision: https://reviews.llvm.org/D63433 llvm-svn: 364046
* [MCA][Bottleneck Analysis] Teach how to compute a critical sequence of instructions based on the simulation.Andrea Di Biagio2019-06-2110-39/+676
| | | | This patch teaches the bottleneck analysis how to identify and print the most expensive sequence of instructions according to the simulation. Fixes PR37494. The goal is to help users identify the sequence of instructions that is most critical for performance. A dependency graph is internally used by the bottleneck analysis to describe data dependencies and processor resource interferences between instructions. There is one node in the graph for every instruction in the input assembly sequence. The number of nodes in the graph is independent of the number of iterations simulated by the tool. It means that a single node of the graph represents all the possible instances of the same instruction contributed by the simulated iterations. Edges are dynamically "discovered" by the bottleneck analysis by observing instruction state transitions and "backend pressure increase" events generated by the Execute stage. Information from the events is used to identify critical dependencies, and materialize edges in the graph. A dependency edge is uniquely identified by a pair of node identifiers plus an instance of struct DependencyEdge::Dependency (which provides more details about the actual dependency kind). The bottleneck analysis internally ranks dependency edges based on their impact on the runtime (see field DependencyEdge::Dependency::Cost). To this end, each edge of the graph has an associated cost. By default, the cost of an edge is a function of its latency (in cycles). In practice, the cost of an edge is also a function of the number of cycles where the dependency has been seen as 'contributing to backend pressure increases'. The idea is that the higher the cost of an edge, the higher the impact of the dependency on performance.
To put it another way, the cost of an edge is a measure of criticality for performance. Note that the same edge may be found in multiple iterations of the simulated loop. The logic that adds new edges to the graph checks if an equivalent dependency already exists (duplicate edges are not allowed). If an equivalent dependency edge is found, field DependencyEdge::Frequency of that edge is incremented by one, and the new cost is cumulatively added to the existing edge cost. At the end of simulation, costs are propagated to nodes through the edges of the graph. The goal is to identify a critical sequence from a node of the root-set (composed of the nodes of the graph with no predecessors) to a 'sink node' with no successors. Note that the graph is intentionally kept acyclic to minimize the complexity of the critical sequence computation algorithm (complexity is currently linear in the number of nodes in the graph). The critical path is finally computed as a sequence of dependency edges. For edges describing processor resource interferences, the view also prints a so-called "interference probability" value (by dividing field DependencyEdge::Frequency by the total number of iterations). Examples of critical sequence computations can be found in tests added/modified by this patch. On output streams that support colored output, instructions from the critical sequence are rendered with a different color. Strictly speaking the analysis conducted by the bottleneck analysis view is not a critical path analysis. The cost of an edge doesn't only depend on the dependency latency. More importantly, the cost of the same edge may be computed differently by different iterations. The number of dependencies is discovered dynamically based on the events generated by the simulator. However, their number is not fixed. This is especially true for edges that model processor resource interferences; an interference may not occur in every iteration.
For that reason, it makes sense to also print out a "probability of interference". By construction, the accuracy of this analysis (as always) is strongly dependent on the simulation (and therefore the quality of the information available in the scheduling model). That being said, the critical sequence effectively identifies a performance criticality. Instructions from that sequence are expected to have a very big impact on performance. So, users can take advantage of this information to focus their attention on specific interactions between instructions. In my experience, it works quite well in practice, and produces useful output (in a reasonable amount of time). Differential Revision: https://reviews.llvm.org/D63543 llvm-svn: 364045
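The edge bookkeeping described in the commit message above (no duplicate edges, a frequency counter, cumulative cost, and an interference probability obtained by dividing the frequency by the number of iterations) can be sketched roughly as below. This is not the llvm-mca implementation: the class, the `EdgeKey` tuple, and the string dependency kinds are invented for illustration, and the sketch assumes a newly discovered edge starts with a frequency of one.

```python
from collections import namedtuple

# Hypothetical edge identity: a pair of node ids plus a dependency kind.
EdgeKey = namedtuple("EdgeKey", "src dst kind")


class DependencyGraph:
    def __init__(self):
        # Maps EdgeKey -> {"frequency": int, "cost": int}; one entry per edge.
        self.edges = {}

    def add_edge(self, src, dst, kind, cost):
        """Record one observation of a dependency, merging duplicates."""
        key = EdgeKey(src, dst, kind)
        stats = self.edges.setdefault(key, {"frequency": 0, "cost": 0})
        stats["frequency"] += 1   # assumption: first sighting counts as 1
        stats["cost"] += cost     # costs accumulate across iterations
        return stats

    def interference_probability(self, src, dst, kind, iterations):
        """Frequency of the edge divided by the number of iterations."""
        return self.edges[EdgeKey(src, dst, kind)]["frequency"] / iterations
```

For example, an edge observed in 2 of 4 simulated iterations would report a probability of 0.5, which matches the intuition that a resource interference need not occur in every iteration.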
* [clangd] Add include-mapping for C symbols.Haojian Wu2019-06-2111-192/+1198
| | | | | | | | | | | | | | | | | | | Summary: This resolves the issue of introducing c++-style includes for C files. - refactor the gen_std.py, make it reusable for parsing C symbols. - add a language mode to the mapping method to use different mapping for C and C++ files. Reviewers: kadircet Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, jfb, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D63270 llvm-svn: 364044