Summary: Add MVE VAND/VORR/VORN/VEOR/VBIC intrinsics. Add unit tests.
Reviewers: simon_tatham, ostannard, dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D70547
Summary: Add MVE VMUL intrinsics. Remove annoying "t1" from VMUL* instructions. Add unit tests.
Reviewers: simon_tatham, ostannard, dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D70546
Summary: Add MVE VABD intrinsics. Add unit tests.
Reviewers: simon_tatham, ostannard, dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D70545
Revert "…increment/decrement (PR44054)"
The assertion that was added does not hold; it breaks on
test-suite/MultiSource/Applications/SPASS/analyze.c.
Will reduce the testcase and revisit.
This reverts commits 9872ea4ed1de4c49300430e4f1f4dfc110a79ab9 and 870f3542d3e0d06d208442bdca6482866b59171b.
This replaces the A32 NEON vqadds, vqaddu, vqsubs and vqsubu intrinsics
with the target independent sadd_sat, uadd_sat, ssub_sat and usub_sat.
This helps generate vqadds from standard IR nodes, which might be
produced from the vectoriser. The old variants are removed in the
process.
Differential Revision: https://reviews.llvm.org/D69350
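A minimal sketch of what this means for users of arm_neon.h (the IR mapping in the comment restates the description above):
```
#include <arm_neon.h>

/* With this change, the saturating NEON intrinsics lower to the
   target-independent saturating add/sub IR nodes (e.g. sadd_sat)
   instead of ARM-specific intrinsics. */
int32x4_t saturating_add(int32x4_t a, int32x4_t b) {
    return vqaddq_s32(a, b);
}
```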
In particular, don't hardcode the signature of the handler:
it takes the source filepath, so the lengths of the buffers will not match.
Summary:
Implicit Conversion Sanitizer is *almost* feature complete.
There aren't *that* many unsanitized things left,
two major ones are increment/decrement (this patch) and bit fields.
As it was discussed in
[[ https://bugs.llvm.org/show_bug.cgi?id=39519 | PR39519 ]],
unlike `CompoundAssignOperator` (which is promoted internally),
or `BinaryOperator` (for which we always have promotion/demotion in AST)
or parts of `UnaryOperator` (we have promotion/demotion but only for
certain operations), for inc/dec, clang omits promotion/demotion
altogether, under as-if rule.
This is technically correct: https://rise4fun.com/Alive/zPgD
As it can be seen in `InstCombineCasts.cpp` `canEvaluateTruncated()`,
`add`/`sub`/`mul`/`and`/`or`/`xor` operators can all arbitrarily
be extended or truncated:
https://github.com/llvm/llvm-project/blob/901cd3b3f62d0c700e5d2c3f97eff97d634bec5e/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp#L1320-L1334
But that has serious implications:
1. Since we no longer model implicit casts, do we pessimise
their AST representation and everything that uses it?
2. There is no demotion, so lossy demotion sanitizer does not trigger :]
Now, I'm not going to argue about the first problem here,
but the second one **needs** to be addressed. As it was stated
in the report, this is done intentionally, so changing
this in all modes would be considered a penalization/regression.
Which means the sanitization-less codegen must not be altered.
It was also suggested not to change the sanitized codegen
to the one with demotion, but I quite strongly believe
that would not be the wise choice here:
1. One will need to re-engineer the check that the inc/dec was lossy
in terms of `@llvm.{u,s}{add,sub}.with.overflow` builtins
2. We will still need to compute the result we would lossily demote.
(i.e. the result of wide `add`ition/`sub`traction)
3. I suspect it would need to be done right here, in sanitization.
Which kinda defeats the point of
using `@llvm.{u,s}{add,sub}.with.overflow` builtins:
we'd have two `add`s with basically the same arguments,
one of which is used for the check+error-less codepath and the other one
for the error reporting. That seems worse than a single wide op+check.
4. OR, we would need to do that in the compiler-rt handler.
Which means we'll need a whole new handler.
But then what about the `CompoundAssignOperator`,
for which it would also be applicable?
So this also doesn't really seem like the right path to me.
5. At least X86 (but likely others) pessimizes all sub-`i32` operations
(due to partial register stalls), so even if we avoid promotion+demotion,
the computations will //likely// be performed in `i32` anyways.
So I'm not really seeing much benefit in
not doing the straightforward thing.
While looking into this, I have noticed a few more LLVM middle-end
missed canonicalizations, and filed
[[ https://bugs.llvm.org/show_bug.cgi?id=44100 | PR44100 ]],
[[ https://bugs.llvm.org/show_bug.cgi?id=44102 | PR44102 ]].
Those are not specific to inc/dec, we also have them for
`CompoundAssignOperator`, and it can happen for normal arithmetic, too.
But if we take some other path in the patch, it will not be applicable
here, and we will have most likely played ourselves.
TLDR: front-end should emit canonical, easy-to-optimize yet
un-optimized code. It is middle-end's job to make it optimal.
I'm really hoping reviewers agree with my personal assessment
of the path this patch should take.
Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=44054 | PR44054 ]].
Reviewers: rjmccall, erichkeane, rsmith, vsk
Reviewed By: erichkeane
Subscribers: mehdi_amini, dexonsmith, cfe-commits, #sanitizers, llvm-commits, aaron.ballman, t.p.northover, efriedma, regehr
Tags: #llvm, #clang, #sanitizers
Differential Revision: https://reviews.llvm.org/D70539
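A minimal sketch of the code this lets the sanitizer catch (built with the -fsanitize=implicit-conversion group described in the report):
```
/* The increment is conceptually 'promote to int, add 1, truncate back
   to unsigned char'; at 255 that truncation is lossy, which the
   sanitizer can now diagnose at runtime. */
void wrap(void) {
    unsigned char c = 255;
    c++;
}
```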
Reapply "As a follow-up to my initial mail to llvm-dev here's a first pass at the O1 described there."
This reapplies: 8ff85ed905a7306977d07a5cd67ab4d5a56fafb4
Original commit message:
As a follow-up to my initial mail to llvm-dev here's a first pass at the O1 described there.
This change doesn't include any change to move from selection dag to fast isel
and that will come with other numbers that should help inform that decision.
There also haven't been any real debuggability studies with this pipeline yet;
this is just the initial start, done so that people could see it and we could
start tweaking after.
Test updates: outside of the newpm tests, most of the updates come from
optimization passes that are not run anymore (and without a compelling
argument to re-add them at the moment), which were largely used for
canonicalization in clang.
Original post:
http://lists.llvm.org/pipermail/llvm-dev/2019-April/131494.html
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65410
This reverts commit c9ddb02659e3ece7a0d9d6b4dac7ceea4ae46e6d.
It is tricky to use replace_path_prefix correctly on Windows, which uses
backslashes as native path separators. Switch back to the old approach
(startswith, which is not ideal) to appease the build bots for now.
GCC 8 implements -fmacro-prefix-map. Like -fdebug-prefix-map, it replaces a string prefix for the __FILE__ macro.
-ffile-prefix-map is the union of -fdebug-prefix-map and -fmacro-prefix-map.
Reviewed By: rnk, Lekensteyn, maskray
Differential Revision: https://reviews.llvm.org/D49466
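A sketch of the effect (paths and invocation are hypothetical):
```
/* Compiled as:
 *   clang -fmacro-prefix-map=/home/user/project=. -c /home/user/project/src/hello.c
 * __FILE__ below expands to "./src/hello.c" instead of the absolute
 * path; -fdebug-prefix-map does the same for debug info only, and
 * -ffile-prefix-map applies both remappings at once. */
#include <stdio.h>

void where(void) {
    printf("%s\n", __FILE__);
}
```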
The modifier system used to mutate types on NEON intrinsic definitions had a
separate letter for all kinds of transformations that might be needed, and we
were quite quickly running out of letters to use. This patch converts to a much
smaller set of orthogonal modifiers that can be applied together to achieve the
desired effect.
When merging with downstream it is likely to cause a conflict with any local
modifications to the .td files. There is a new script in
utils/convert_arm_neon.py that was used to convert all .td definitions and I
would suggest running it on the last downstream version of those files before
this commit rather than resolving conflicts manually.
The original version broke vcreate_* because it became a macro and didn't
apply the normal integer promotion rules before bitcasting to a vector.
This adds a temporary.
Revert "As a follow-up to my initial mail to llvm-dev here's a first pass at the O1 described there."
This reverts commit 8ff85ed905a7306977d07a5cd67ab4d5a56fafb4.
This commit introduced 9 new failures on lldb buildbot host at http://lab.llvm.org:8014/builders/lldb-aarch64-ubuntu
Following tests were failing:
lldb-api :: functionalities/tail_call_frames/ambiguous_tail_call_seq1/TestAmbiguousTailCallSeq1.py
lldb-api :: functionalities/tail_call_frames/ambiguous_tail_call_seq2/TestAmbiguousTailCallSeq2.py
lldb-api :: functionalities/tail_call_frames/disambiguate_call_site/TestDisambiguateCallSite.py
lldb-api :: functionalities/tail_call_frames/disambiguate_paths_to_common_sink/TestDisambiguatePathsToCommonSink.py
lldb-api :: functionalities/tail_call_frames/disambiguate_tail_call_seq/TestDisambiguateTailCallSeq.py
lldb-api :: functionalities/tail_call_frames/inlining_and_tail_calls/TestInliningAndTailCalls.py
lldb-api :: functionalities/tail_call_frames/sbapi_support/TestTailCallFrameSBAPI.py
lldb-api :: functionalities/tail_call_frames/thread_step_out_message/TestArtificialFrameStepOutMessage.py
lldb-api :: functionalities/tail_call_frames/thread_step_out_or_return/TestSteppingOutWithArtificialFrames.py
lldb-api :: functionalities/tail_call_frames/unambiguous_sequence/TestUnambiguousTailCalls.py
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65410
As a follow-up to my initial mail to llvm-dev here's a first pass at the O1 described there.
This change doesn't include any change to move from selection dag to fast isel
and that will come with other numbers that should help inform that decision.
There also haven't been any real debuggability studies with this pipeline yet;
this is just the initial start, done so that people could see it and we could
start tweaking after.
Test updates: outside of the newpm tests, most of the updates come from
optimization passes that are not run anymore (and without a compelling
argument to re-add them at the moment), which were largely used for
canonicalization in clang.
Original post:
http://lists.llvm.org/pipermail/llvm-dev/2019-April/131494.html
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65410
This has the main effect of causing target-cpu and target-features to be set
on __cfi_check_fail, causing the function to become ABI-compatible with other
functions in the case where these attributes affect ABI (e.g. reserve-x18).
Technically we only need to call SetLLVMFunctionAttributes to get the target-*
attributes set, but since we're creating a definition we probably ought to
call the ForDefinition function as well.
Fixes PR44094.
Differential Revision: https://reviews.llvm.org/D70692
Revert "…multiple modifiers."
This broke the vcreate_u64 intrinsic. Example:
```
$ cat /tmp/a.cc
#include <arm_neon.h>
void g() {
    auto v = vcreate_u64(0);
}
$ bin/clang -c /tmp/a.cc --target=arm-linux-androideabi16 -march=armv7-a
/tmp/a.cc:4:12: error: C-style cast from scalar 'int' to vector 'uint64x1_t' (vector of 1 'uint64_t' value) of different size
    auto v = vcreate_u64(0);
             ^~~~~~~~~~~~~~
/work/llvm.monorepo/build.release/lib/clang/10.0.0/include/arm_neon.h:4144:11: note: expanded from macro 'vcreate_u64'
    __ret = (uint64x1_t)(__p0); \
            ^~~~~~~~~~~~~~~~~~
```
Reverting until this can be investigated.
> The modifier system used to mutate types on NEON intrinsic definitions had a
> separate letter for all kinds of transformations that might be needed, and we
> were quite quickly running out of letters to use. This patch converts to a much
> smaller set of orthogonal modifiers that can be applied together to achieve the
> desired effect.
>
> When merging with downstream it is likely to cause a conflict with any local
> modifications to the .td files. There is a new script in
> utils/convert_arm_neon.py that was used to convert all .td definitions and I
> would suggest running it on the last downstream version of those files before
> this commit rather than resolving conflicts manually.
When the Dwarf Version metadata was initially added (r184276) there was
no support for Module::Max - though the comment suggested that was the
desired behavior. The original behavior was Module::Warn which would
warn and then pick whichever version came first - which is pretty
arbitrary/luck-based if the consumer has some need for one version or
the other.
Now that the functionality's been added (r303590) this change updates
the implementation to match the desired goal.
The general logic here is - if you compile /some/ of your program with a
more recent DWARF version, you must have a consumer that can handle it,
so might as well use it for /everything/.
The only place where this might fall down is if you need to use an old
tool (supporting only the older DWARF version) for some subset of your
program; in that case it'll now all be the higher version. That seems
pretty narrow (& the inverse could happen too - you specifically /need/
the higher DWARF version for some extra expressivity, etc., in some part
of the program).
We seem to have been gradually growing support for atomic min/max operations
(exposing longstanding IR atomicrmw instructions). But until now there have
been gaps in the expected intrinsics. This adds support for the C11-style
intrinsics (i.e. taking _Atomic, rather than being individually blessed by
the C11 standard), and the variants that return the new value instead of the original
one.
That way, people won't be misled by trying one form and it not working, and the
front-end is more friendly to people using _Atomic types, as we recommend.
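A sketch of the C11-style form (the builtin spelling is an assumption, following the existing __c11_atomic_* naming):
```
#include <stdatomic.h>

/* Like the other fetch operations, this returns the value the object
   held *before* the minimum was taken. */
int fetch_min(_Atomic int *p, int v) {
    return __c11_atomic_fetch_min(p, v, memory_order_relaxed);
}
```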
The modifier system used to mutate types on NEON intrinsic definitions had a
separate letter for all kinds of transformations that might be needed, and we
were quite quickly running out of letters to use. This patch converts to a much
smaller set of orthogonal modifiers that can be applied together to achieve the
desired effect.
When merging with downstream it is likely to cause a conflict with any local
modifications to the .td files. There is a new script in
utils/convert_arm_neon.py that was used to convert all .td definitions and I
would suggest running it on the last downstream version of those files before
this commit rather than resolving conflicts manually.
It turns out that the ExprMutationAnalyzer can be very slow when the AST
gets huge in some cases. The idea is to move this analysis to the LLVM
back-end level (more precisely, into the LiveDebugValues pass). The new
approach will remove the performance regression, simplify the
implementation and give us a front-end independent implementation.
Differential Revision: https://reviews.llvm.org/D68206
…(reland with fixes)
Currently, clang emits subprograms for declared functions when the
target debugger or DWARF standard is known to support entry values
(DW_OP_entry_value & the GNU equivalent).
Treat DW_AT_tail_call the same way to allow debuggers to follow cross-TU
tail calls.
Pre-patch debug session with a cross-TU tail call:
```
* frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
frame #1: 0x0000000100000f99 main`main at a.c:8:10 [opt]
```
Post-patch (note that the tail-calling frame, "helper", is visible):
```
* frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
frame #1: 0x0000000100000f80 main`helper [opt] [artificial]
frame #2: 0x0000000100000f99 main`main at a.c:8:10 [opt]
```
This was reverted in 5b9a072c because it attached declaration
subprograms to inlinable builtin calls, which interacted badly with the
MergeICmps pass. The fix is to not attach declarations to builtins.
rdar://46577651
Differential Revision: https://reviews.llvm.org/D69743
The CUDA builtin library is apparently compiled in C++ mode, so the
assumption of convergent needs to be made in a typically non-SPMD
language. The functions in the library should still be assumed
convergent. Currently they are not, which is potentially incorrect;
it merely happens to work after the library is linked.
This fills in the small family of MVE intrinsics that have nothing to
do with vectors: they implement bit-shift operations on 32- or 64-bit
values held in one or two general-purpose registers. Most of these
shift operations saturate if shifting left, and round to nearest if
shifting right, although LSLL and ASRL behave like ordinary shifts.
When these instructions take a variable shift count in a register,
they pay attention to its sign, so that (for example) LSLL or UQRSHLL
will shift left if given a positive number but right if given a
negative one. That makes even LSLL and ASRL different enough from
standard LLVM IR shift semantics that I couldn't see any better
alternative than to simply model the whole family as a set of
MVE-specific IR intrinsics.
(The //immediate// forms of LSLL and ASRL, on the other hand, do
behave exactly like a standard IR shift of a 64-bit value. In fact,
those forms don't have ACLE intrinsics defined at all, because you can
just write an ordinary C shift operation if you want one of those.)
The 64-bit shifts have to be instruction-selected in C++, because they
deliver two output values. But the 32-bit ones are simple enough that
I could write a DAG isel pattern directly into each Instruction
record.
Reviewers: ostannard, MarkMurrayARM, dmgreen
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D70319
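A sketch using an ACLE spelling from arm_mve.h (the spelling is an assumption based on the ACLE specification):
```
#include <arm_mve.h>

/* LSLL on a 64-bit value held in two GPRs: shifts left for a positive
   count, right for a negative one, per the description above. */
uint64_t shift_either_way(uint64_t v, int32_t count) {
    return lsll(v, count);
}
```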
Revert "…-ffp-model=, and -ffp-exception-behavior="
and a follow-up NFC rearrangement, as it's causing a crash on valid code. The testcase is on the original review thread.
This reverts commits af57dbf12e54f3a8ff48534bf1078f4de104c1cd and e6584b2b7b2de06f1e59aac41971760cac1e1b79.
This adds the `vcmp` family of ACLE MVE intrinsics: vector/vector,
vector/scalar, and the predicated forms of both. All are represented
using standard existing IR: vector/scalar comparisons are represented
by making a vector out of the scalar first, and predicated forms are
represented by taking the bitwise AND of the input predicate and the
output of the comparison. Existing LLVM-side tests demonstrate that
ISel will pattern-match all of that back down to single MVE VCMPs.
The idiom of handling a vector/scalar operation by generating IR to
expand the scalar into a second vector is going to be needed for a lot
of MVE intrinsics, so to make that easy, I've provided a helper
function that automatically works out the element count.
The comparison intrinsics are the first ones that have to //return// a
predicate, in the user-facing `mve_pred16_t` format. This means we
have to use the `arm_mve_pred_v2i` low-level intrinsic to convert it
back from the logical `<n x i1>` form used in IR. I've done that
explicitly in the code gen specification for the builtins, because it
happens much more rarely in the ACLE API than passing a Predicate as
input, so it didn't seem worth automating in MveEmitter.
Reviewers: ostannard, MarkMurrayARM, dmgreen
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D70297
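A sketch of the vector/scalar form and its predicated variant (ACLE spellings assumed):
```
#include <arm_mve.h>

/* v > splat(x), lane-wise; the result is an mve_pred16_t predicate. */
mve_pred16_t greater(int32x4_t v, int32_t x) {
    return vcmpgtq_n_s32(v, x);
}

/* Predicated form: the comparison result is ANDed with p. */
mve_pred16_t greater_pred(int32x4_t v, int32_t x, mve_pred16_t p) {
    return vcmpgtq_m_n_s32(v, x, p);
}
```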
This patch implements `__attribute__((target("branch-protection=...")))`
in a manner compatible with the analogous GCC feature:
https://gcc.gnu.org/onlinedocs/gcc-9.2.0/gcc/AArch64-Function-Attributes.html#AArch64-Function-Attributes
Differential Revision: https://reviews.llvm.org/D68711
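A sketch of the attribute syntax, mirroring the GCC documentation linked above:
```
/* Enable return-address signing (including leaf functions) for this
   one function; the rest of the TU keeps the command-line setting. */
__attribute__((target("branch-protection=pac-ret+leaf")))
void sensitive(void) {
}
```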
This reverts commit rG1643734741d2 due to an LLDB test failure.
It turns out that the ExprMutationAnalyzer can be very slow when the AST
gets huge in some cases. The idea is to move this analysis to the LLVM
back-end level (more precisely, into the LiveDebugValues pass). The new
approach will remove the performance regression, simplify the
implementation and give us a front-end independent implementation.
Differential Revision: https://reviews.llvm.org/D68206
This adds the `vgetq_lane` and `vsetq_lane` families, to copy between
a scalar and a specified lane of a vector.
One of the new `vgetq_lane` intrinsics returns a `float16_t`, which
causes a compile error if `%clang_cc1` doesn't get the option
`-fallow-half-arguments-and-returns`. The driver passes that option to
cc1 already, but I've had to edit all the explicit cc1 command lines
in the existing MVE intrinsics tests.
A couple of fixes are included for the code I wrote up front in
MveEmitter to support lane-index immediates (and which nothing has
tested until now): the type was wrong (`uint32_t` instead of `int`)
and the range was off by one.
I've also added a method of bypassing the default promotion to `i32`
that is done by the MveEmitter code generation: it's sensible to
promote short scalars like `i16` to `i32` if they're going to be
passed to custom IR intrinsics representing a machine instruction
operating on GPRs, but not if they're going to be passed to standard
IR operations like `insertelement` which expect the exact type.
Reviewers: ostannard, MarkMurrayARM, dmgreen
Reviewed By: dmgreen
Subscribers: kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D70188
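A sketch of the new lane intrinsics (MVE vectors are always 128 bits, so float32x4_t has four lanes; the lane index must be an immediate):
```
#include <arm_mve.h>

float32x4_t swap_ends(float32x4_t v) {
    float lo = vgetq_lane_f32(v, 0);
    float hi = vgetq_lane_f32(v, 3);
    v = vsetq_lane_f32(hi, v, 0);
    return vsetq_lane_f32(lo, v, 3);
}
```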
This batch of intrinsics includes lots of things that move vector data
around or change its type without really affecting its value very
much. It includes the `vreinterpretq` family (cast one vector type to
another); `vuninitializedq` (create a vector of a given type with
don't-care contents); and `vcreateq` (make a 128-bit vector out of two
`uint64_t` halves).
These are all implemented using completely standard IR that's already
tested in existing LLVM unit tests, so I've just written a clang test
to check the IR is correct, and left it at that.
I've also added some richer infrastructure to the MveEmitter Tablegen
backend, to make it specify the exact integer type of integer
arguments passed to IR construction functions, and wrap those
arguments in a `static_cast` in the autogenerated C++. That was
necessary to prevent an overloading ambiguity when passing the integer
literal `0` to `IRBuilder::CreateInsertElement`, because otherwise, it
could mean either a null pointer `llvm::Value *` or a zero `uint64_t`.
Reviewers: ostannard, MarkMurrayARM, dmgreen
Subscribers: kristof.beyls, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D70133
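A sketch of the vcreateq and vreinterpretq families (ACLE spellings assumed):
```
#include <arm_mve.h>

uint32x4_t from_halves(uint64_t lo, uint64_t hi) {
    uint64x2_t v = vcreateq_u64(lo, hi); /* two 64-bit halves -> 128 bits */
    return vreinterpretq_u32_u64(v);     /* pure bit-cast, no value change */
}
```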
No instruction takes mxcsr as an operand, so we should always
treat it as an identifier name.
Depending on the cmake configuration, clang may generate different IR
names for slot variables. Use a regex instead of hard-coding the name.
I did the same for the other bpf-attr-preserve-access-index tests,
but somehow missed this one.
This is a resubmission of the previously reverted commit
943436040121 with the same subject. This commit fixed the
segfault issue and addressed additional review comments.
This patch introduced a new bpf specific attribute which can
be added to struct or union definition. For example,
struct s { ... } __attribute__((preserve_access_index));
union u { ... } __attribute__((preserve_access_index));
The goal is to simplify user code for cases
where preserving the access index is desired for certain structs/unions,
so the user does not need to use clang's __builtin_preserve_access_index
for every member access.
The attribute has no effect if -g is not specified.
When the attribute is specified and -g is specified, any member
access defined by that structure or union, including array subscript
access and inner records, will be preserved through
__builtin_preserve_{array,struct,union}_access_index()
IR intrinsics, which will enable relocation generation
in bpf backend.
The following is an example to illustrate the usage:
```
-bash-4.4$ cat t.c
#define __reloc__ __attribute__((preserve_access_index))
struct s1 {
    int c;
} __reloc__;
struct s2 {
    union {
        struct s1 b[3];
    };
} __reloc__;
struct s3 {
    struct s2 a;
} __reloc__;
int test(struct s3 *arg) {
    return arg->a.b[2].c;
}
-bash-4.4$ clang -target bpf -g -S -O2 t.c
```
A relocation with access string "0:0:0:0:2:0" will be generated
representing access offset of arg->a.b[2].c.
Forward declarations with the attribute are also handled properly, such
that the attribute is copied and populated in the real record definition.
Differential Revision: https://reviews.llvm.org/D69759
This patch adds the ACLE intrinsics for all the MVE load and store
instructions not already handled by D69791. These ones don't need new
IR intrinsics, because they can be implemented in terms of standard
LLVM IR constructions.
Some of the load and store instructions access less than 128 bits of
memory, sign/zero extending each value to a wider vector lane on load
or truncating it on store. These are represented in IR by a load of a
shorter vector followed by a zext/sext, and conversely, a trunc
followed by a short store. Existing ISel patterns already recognize
those combinations and turn them into the right MVE instructions.
The predicated forms of all these instructions are represented in the
same way, except that the ordinary load/store operation is replaced
with the existing intrinsics @llvm.masked.{load,store}. These are
currently only code-generated as predicated MVE load/store
instructions if you give LLVM the `-enable-arm-maskedldst` option; so
I've done that in the LLVM codegen test. When we make that the
default, that option can be removed.
In the Tablegen backend, I've had to add a handful of extra support
features:
* We need to be able to make clang::Address objects out of a
pointer and an alignment (previously we only needed these when the
user passed us an existing one).
* We can now specify vector types that aren't 128 bits wide (for use
in those intermediate values in IR), the parametrized type system
can make one starting from two existing vector types (using the lane
count of one and the element type of the other).
* I've added support for code generation of pointer casts, and for
specifying LLVM types as operands to IRBuilder operations (for zext
and sext, though I think they'll come in useful again).
* Now not all IR construction operations need to be specified as
Builder.CreateFoo; some don't involve a Builder at all, and one
passes it as a parameter to a tiny static helper function in
CGBuiltin.cpp.
Reviewers: ostannard, MarkMurrayARM, dmgreen
Subscribers: kristof.beyls, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D70088
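A sketch of one widening load (intrinsic spelling assumed from the ACLE; per the description, it becomes a short vector load plus a sign-extend in IR):
```
#include <arm_mve.h>

/* Loads 8 bytes and sign-extends each one into a 16-bit lane. */
int16x8_t widen_8_to_16(const int8_t *p) {
    return vldrbq_s16(p);
}
```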
This reverts commit 4a5aa1a7bf8b1714b817ede8e09cd28c0784228a.
There are some other test failures. Investigate them first.
This patch introduced a new bpf specific attribute which can
be added to struct or union definition. For example,
struct s { ... } __attribute__((preserve_access_index));
union u { ... } __attribute__((preserve_access_index));
The goal is to simplify user code for cases
where preserving the access index is desired for certain structs/unions,
so the user does not need to use clang's __builtin_preserve_access_index
for every member access.
The attribute has no effect if -g is not specified.
When the attribute is specified and -g is specified, any member
access defined by that structure or union, including array subscript
access and inner records, will be preserved through
__builtin_preserve_{array,struct,union}_access_index()
IR intrinsics, which will enable relocation generation
in bpf backend.
The following is an example to illustrate the usage:
```
-bash-4.4$ cat t.c
#define __reloc__ __attribute__((preserve_access_index))
struct s1 {
    int c;
} __reloc__;
struct s2 {
    union {
        struct s1 b[3];
    };
} __reloc__;
struct s3 {
    struct s2 a;
} __reloc__;
int test(struct s3 *arg) {
    return arg->a.b[2].c;
}
-bash-4.4$ clang -target bpf -g -S -O2 t.c
```
A relocation with access string "0:0:0:0:2:0" will be generated
representing access offset of arg->a.b[2].c.
Forward declarations with the attribute are also handled properly, such
that the attribute is copied and populated in the real record definition.
Differential Revision: https://reviews.llvm.org/D69759
Differential Revision: https://reviews.llvm.org/D69648
As we currently have it implemented in altivec.h, the offsets for these two
intrinsics are element offsets. The documentation in the ABI (as well as the
implementation in both XL and GCC) states that these should be byte offsets.
Differential revision: https://reviews.llvm.org/D63636
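A sketch of the corrected semantics (VSX must be enabled for vec_xl):
```
#include <altivec.h>

/* After this fix the first argument is a byte offset, as the ABI
   specifies: this loads the vector that starts 16 bytes past p,
   not 16 elements past it. */
vector signed int second_vector(signed int *p) {
    return vec_xl(16, p);
}
```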
We currently emit a double precision comparison instruction for this, whereas we
need to emit the single precision version.
Differential revision: https://reviews.llvm.org/D64024
…-ffp-exception-behavior=
Add options to control floating point behavior: trapping and
exception behavior, rounding, and control of optimizations that affect
floating point calculations. More details in UsersManual.rst.
Reviewers: rjmccall
Differential Revision: https://reviews.llvm.org/D62731
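A sketch of typical invocations (values as described in UsersManual.rst; the source file is hypothetical):
```
/* Example invocations:
 *   clang -ffp-model=strict t.c                # strict exception semantics
 *   clang -ffp-model=fast t.c                  # allow fast-math optimizations
 *   clang -ffp-exception-behavior=maytrap t.c  # no spurious FP traps
 * Outside the fast model, the compiler must not reassociate this sum. */
double sum3(double a, double b, double c) {
    return (a + b) + c;
}
```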
Atomic compound expressions try to use atomicrmw if possible, but this
path doesn't set the Result variable, leaving it to crash in later code
if anything ever tries to use the result of the expression. This fixes
that issue by recalculating the new value from the atomically loaded
old one.
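A minimal sketch of the affected shape:
```
#include <stdatomic.h>

/* The += takes the atomicrmw fast path; before this fix, using the
   value of the compound expression could crash codegen. */
int bump(_Atomic int *p) {
    return (*p += 1);
}
```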
Revert "…is understood"
This caused Chromium builds to fail with "inlinable function call in a function
with debug info must have a !dbg location" errors. See
https://bugs.chromium.org/p/chromium/issues/detail?id=1022296#c1 for a
reproducer.
> Currently, clang emits subprograms for declared functions when the
> target debugger or DWARF standard is known to support entry values
> (DW_OP_entry_value & the GNU equivalent).
>
> Treat DW_AT_tail_call the same way to allow debuggers to follow cross-TU
> tail calls.
>
> Pre-patch debug session with a cross-TU tail call:
>
> ```
> * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
> frame #1: 0x0000000100000f99 main`main at a.c:8:10 [opt]
> ```
>
> Post-patch (note that the tail-calling frame, "helper", is visible):
>
> ```
> * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
> frame #1: 0x0000000100000f80 main`helper [opt] [artificial]
> frame #2: 0x0000000100000f99 main`main at a.c:8:10 [opt]
> ```
>
> rdar://46577651
>
> Differential Revision: https://reviews.llvm.org/D69743
'a' used to implement a splat in C++ code in NeonEmitter.cpp, but this
can be done directly from .td expansions now (and most ops already did so).
Removing it simplifies the overall code.
https://reviews.llvm.org/D69716
This patch adds two new families of intrinsics, both of which are
memory accesses taking a vector of locations to load from / store to.
The vldrq_gather_base / vstrq_scatter_base intrinsics take a vector of
base addresses, and an immediate offset to be added consistently to
each one. vldrq_gather_offset / vstrq_scatter_offset take a scalar
base address, and a vector of offsets to add to it. The
'shifted_offset' variants also multiply each offset by the element
size, so that the vector is effectively a vector of array indices.
At the IR level, these operations are represented by a single set of
four IR intrinsics: {gather,scatter} × {base,offset}. The other
details (signed/unsigned, shift, and memory element size as opposed to
vector element size) are all specified by IR intrinsic polymorphism
and immediate operands, because that made the selection job easier
than making a huge family of similarly named intrinsics.
I considered using the standard IR representations such as
llvm.masked.gather, but they're not a good fit. In order to use
llvm.masked.gather to represent a gather_offset load with element size
smaller than a pointer, you'd have to expand the <8 x i16> vector of
offsets into an <8 x i16*> vector of pointers, which would be split up
during legalization, so you'd spend most of your time undoing the mess
it had made. Also, ISel support for llvm.masked.gather would be easy
enough in a trivial way (you can expand it into a gather-base load
with a zero immediate offset), but instruction-selecting lots of
fiddly idioms back into all the _other_ MVE load instructions would be
much more work. So I think dedicated IR intrinsics are the more
sensible approach, at least for the moment.
On the clang tablegen side, I've added two new features to the
Tablegen source accepted by MveEmitter: a 'CopyKind' type node for
defining a type that varies with the parameter type (it lets you ask
for an unsigned integer type of the same width as the parameter), and
an 'unsignedflag' value node for passing an immediate IR operand which
is 0 for a signed integer type or 1 for an unsigned one. That lets me
write each kind of intrinsic just once and get all its subtypes and
immediate arguments generated automatically.
Also I've tweaked the handling of pointer-typed values in the code
generation part of MveEmitter: they're generated as Address rather
than Value (i.e. including an alignment) so that they can be given to
the ordinary IR load and store operations, but I'd omitted the code to
convert them back to Value when they're going to be used as an
argument to an IR intrinsic.
On the MC side, I've enhanced MVEVectorVTInfo so that it can tell you
not only the full assembly-language suffix for a given vector type
(like 's32' or 'u16') but also the numeric-only one used by store
instructions (just '32' or '16').
Reviewers: dmgreen
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D69791
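A sketch of the two addressing styles (ACLE spellings assumed):
```
#include <arm_mve.h>

/* Scalar base plus a vector of offsets; the 'shifted' variant scales
   each offset by the element size, so idx holds array indices. */
int32x4_t gather(const int32_t *base, uint32x4_t idx) {
    return vldrwq_gather_shifted_offset_s32(base, idx);
}
```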
Recognize -mnop-mcount from the command line and add a function attribute
"mnop-mcount"="true" when passed.
When this option is used, a nop is added instead of a call to fentry. This
is used when building the Linux Kernel.
If this option is passed for any target other than SystemZ, an error is
generated.
Review: Ulrich Weigand
https://reviews.llvm.org/D67763
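A sketch of the intended use (the invocation is an assumption, pairing the option with the usual kernel -pg/-mfentry flags):
```
/* clang --target=s390x-linux-gnu -pg -mfentry -mnop-mcount -c f.c
   emits a patchable NOP at the entry of traced() instead of a call
   to fentry; on any non-SystemZ target the option is rejected. */
void traced(void) {
}
```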
Currently, clang emits subprograms for declared functions when the
target debugger or DWARF standard is known to support entry values
(DW_OP_entry_value & the GNU equivalent).
Treat DW_AT_tail_call the same way to allow debuggers to follow cross-TU
tail calls.
Pre-patch debug session with a cross-TU tail call:
```
* frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
frame #1: 0x0000000100000f99 main`main at a.c:8:10 [opt]
```
Post-patch (note that the tail-calling frame, "helper", is visible):
```
* frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
frame #1: 0x0000000100000f80 main`helper [opt] [artificial]
frame #2: 0x0000000100000f99 main`main at a.c:8:10 [opt]
```
rdar://46577651
Differential Revision: https://reviews.llvm.org/D69743
This reverts commit 004ed2b0d1b86d424643ffc88fce20ad8bab6804.
Original commit hash 6d03890384517919a3ba7fe4c35535425f278f89
Summary:
This adds a clang option to disable inline line tables. When it is used,
the inliner uses the call site as the location of the inlined function instead of
marking it as an inline location with the function location.
https://reviews.llvm.org/D67723
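A sketch, assuming the flag spelling from D67723:
```
/* clang -O2 -g -gno-inline-line-tables -c t.c
   After inlining, code from leaf() is attributed to the call site in
   caller() instead of carrying leaf()'s own line numbers. */
static inline int leaf(int x) { return x * 2; }

int caller(int x) { return leaf(x) + 1; }
```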
Summary:
This patch adds the "mayRaiseFPException" flag, and FPCW and FPSW usage,
for X87 instructions which could raise a floating-point exception.
Reviewers: pengfei, RKSimon, andrew.w.kaylor, uweigand, kpn, spatel, cameron.mcinally, craig.topper
Reviewed By: craig.topper
Subscribers: thakis, hiraditya, llvm-commits
Patch by LiuChen.
Differential Revision: https://reviews.llvm.org/D68854