bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[X86] ABI compat bugfix for MSVC vectorcall	Reid Kleckner	2020-01-14	1	-0/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Before this change, X86_32ABIInfo::classifyArgument would be called twice on vector arguments to vectorcall functions. This function has side effects to track GPR register usage, and this would lead to incorrect GPR usage in some cases. The specific case I noticed is from running out of XMM registers with mixed FP and vector arguments and no aggregates of any kind. Consider this prototype: void __vectorcall vectorcall_indirect_vec( double xmm0, double xmm1, double xmm2, double xmm3, double xmm4, __m128 xmm5, __m128 ecx, int edx, __m128 mem); classifyArgument has no effects when called on a plain FP type, but when called on a vector type, it modifies FreeRegs to model GPR consumption. However, this should not happen during the vector call first pass. I refactored the code to unify vectorcall HVA logic with regcall HVA logic. The conventions pass HVAs in registers differently (expanded vs. not expanded), but if they do not fit in registers, they both pass them indirectly by address. Reviewers: erichkeane, craig.topper Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D72110
*	Fix windows bot failures in c410adb092c9cb51ddb0b55862b70f2aa8c5b16f	Rong Xu	2020-01-14	1	-1/+1
\| \| \| \|	(clang diagnostic handler for IR input files)
*	[remark][diagnostics] Using clang diagnostic handler for IR input files	Rong Xu	2020-01-14	4	-1/+56
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For IR input files, we currently use LLVM diagnostic handler even the compilation is from clang. As a result, we are not able to use -Rpass to get the transformation reports. Some warnings are not handled properly either: We found many mysterious warnings in our ThinLTO backend compilations in SamplePGO and CSPGO. An example of the warning: "warning: net/proto2/public/metadata_lite.h:51:21: 0.02% (1 / 4999)" This turns out to be a warning by Wmisexpect, which is supposed to be filtered out by default. But since the filter is in clang's diagnostic hander, we emit these incomplete warnings from LLVM's diagnostic handler. This patch uses clang diagnostic handler for IR input files. We create a fake backendconsumer just to install the diagnostic handler. With this change, we will have proper handling of all the warnings and we can use -Rpass* options in IR input files compilation. Also note that with is patch, LLVM's diagnostic options, like "-mllvm -pass-remarks=*", are no longer be able to get optimization remarks. Differential Revision: https://reviews.llvm.org/D72523
*	[RISCV] Fix ILP32D lowering for double+double/double+int return types	James Clarke	2020-01-14	1	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Previously, since these aggregates are > 2*XLen, Clang would think they were being returned indirectly and thus would decrease the number of available GPRs available by 1. For long argument lists this could lead to a struct argument incorrectly being passed indirectly. Reviewers: asb, lenary Reviewed By: asb, lenary Subscribers: luismarques, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, lenary, s.egerton, pzheng, sameer.abuasal, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D69590
*	Revert "[ThinLTO] Add additional ThinLTO pipeline testing with new PM"	Teresa Johnson	2020-01-13	1	-236/+0
\| \| \| \| \| \| \| \| \| \| \|	This reverts commit 2af97be8027a0823b88d4b6a07fc5eedb440bc1f. After attempting to fix bot failures from matching issues (mostly due to inconsistent printing of "llvm::" prefixes on objects, and AnalysisManager objects being printed differntly, I am now seeing some differences I don't understand (real differences in the passes being printed). Giving up at this point to allow the bots to recover. Will revisit later.
*	Add a couple of missed wildcards in debug-pass-manager output checking	Teresa Johnson	2020-01-13	1	-1/+1
\| \| \| \| \| \| \|	Along with the previous fix for bot failures from 2af97be8027a0823b88d4b6a07fc5eedb440bc1f, need to add a wildcard in a couple of places where my local output did not print "llvm::" but the bot is.
*	Hopefully last fix for bot failures	Teresa Johnson	2020-01-13	1	-43/+43
\| \| \| \| \| \| \| \| \|	Hopefully final bot fix for last few failures from 2af97be8027a0823b88d4b6a07fc5eedb440bc1f. Looks like sometimes the "llvm::" preceeding objects get printed in the debug pass manager output and sometimes they don't. Replace with wildcard matching.
*	Try number 2 for fixing bot failures	Teresa Johnson	2020-01-13	1	-8/+8
\| \| \| \| \| \| \| \|	Additional fixes for bot failures from 2af97be8027a0823b88d4b6a07fc5eedb440bc1f. Remove more exact matching on AnalyisManagers, as they can vary. Also allow different orders between LoopAnalysis and BranchProbabilityAnalysis as that can vary due to both being accessed in the parameter list of a call.
*	Fix tests for builtbot failures	Teresa Johnson	2020-01-13	1	-10/+10
\| \| \| \| \| \| \| \| \|	Should fix most of the buildbot failures from 2af97be8027a0823b88d4b6a07fc5eedb440bc1f, by loosening up the matching on the AnalysisProxy output. Added in --dump-input=fail on the one test that appears to be something different, so I can hopefully debug it better.
*	[ThinLTO] Add additional ThinLTO pipeline testing with new PM	Teresa Johnson	2020-01-13	1	-0/+236
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: I've added some more extensive ThinLTO pipeline testing with the new PM, motivated by the bug fixed in D72386. I beefed up llvm/test/Other/new-pm-pgo.ll a little so that it tests ThinLTO pre and post link with PGO, similar to the testing for the default pipelines with PGO. Added new pre and post link PGO tests for both instrumentation and sample PGO that exhaustively test the pipelines at different optimization levels via opt. Added a clang test to exhaustively test the post link pipeline invoked for distributed builds. I am currently only testing O2 and O3 since these are the most important for performance. It would be nice to add similar exhaustive testing for full LTO, and for the old PM, but I don't have the bandwidth now and this is a start to cover some of the situations that are not currently default and were under tested. Reviewers: wmi Subscribers: mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, jfb, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72538
*	driver: Allow -fdebug-compilation-dir=foo in joined form.	Nico Weber	2020-01-10	1	-0/+1
\| \| \| \| \| \| \|	All 130+ f_Group flags that take an argument allow it after a '=', except for fdebug-complation-dir. Add a Joined<> alias so that it behaves consistently with all the other f_Group flags. (Keep the old Separate flag for backwards compat.)
*	[Driver][CodeGen] Add -fpatchable-function-entry=N[,0]	Fangrui Song	2020-01-10	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \|	In the backend, this feature is implemented with the function attribute "patchable-function-entry". Both the attribute and XRay use TargetOpcode::PATCHABLE_FUNCTION_ENTER, so the two features are incompatible. Reviewed By: ostannard, MaskRay Differential Revision: https://reviews.llvm.org/D72222
*	Support function attribute patchable_function_entry	Fangrui Song	2020-01-10	1	-0/+21
\| \| \| \| \| \| \| \| \|	This feature is generic. Make it applicable for AArch64 and X86 because the backend has only implemented NOP insertion for AArch64 and X86. Reviewed By: nickdesaulniers, aaron.ballman Differential Revision: https://reviews.llvm.org/D72221
*	Add support for __declspec(guard(nocf))	Andrew Paverd	2020-01-10	1	-0/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Avoid using the `nocf_check` attribute with Control Flow Guard. Instead, use a new `"guard_nocf"` function attribute to indicate that checks should not be added on indirect calls within that function. Add support for `__declspec(guard(nocf))` following the same syntax as MSVC. Reviewers: rnk, dmajor, pcc, hans, aaron.ballman Reviewed By: aaron.ballman Subscribers: aaron.ballman, tomrittervg, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72167
*	[FPEnv] Generate constrained FP comparisons from clang	Ulrich Weigand	2020-01-10	2	-0/+302
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Update the IRBuilder to generate constrained FP comparisons in CreateFCmp when IsFPConstrained is true, similar to the other places in the IRBuilder. Also, add a new CreateFCmpS to emit signaling FP comparisons, and use it in clang where comparisons are supposed to be signaling (currently, only when emitting code for the <, <=, >, >= operators). Note that there is currently no way to add fast-math flags to a constrained FP comparison, since this is implemented as an intrinsic call that returns a boolean type, and FMF are only allowed for calls returning a floating-point type. However, given the discussion around https://bugs.llvm.org/show_bug.cgi?id=42179, it seems that FCmp itself really shouldn't have any FMF either, so this is probably OK. Reviewed by: craig.topper Differential Revision: https://reviews.llvm.org/D71467
*	[ARM,MVE] Make `vqrshrun` generate the right instruction.	Simon Tatham	2020-01-10	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: A copy-paste error in `arm_mve.td` meant that the MVE `vqrshrun` intrinsic family was generating the `vqshrun` machine instruction, because in the IR intrinsic call, the rounding flag argument was set to 0 rather than 1. Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard Reviewed By: dmgreen Subscribers: kristof.beyls, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D72496
*	Allow system header to provide their own implementation of some builtin	serge-sans-paille	2020-01-10	2	-0/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a system header provides an (inline) implementation of some of their function, clang still matches on the function name and generate the appropriate llvm builtin, e.g. memcpy. This behavior is in line with glibc recommendation « users may not provide their own version of symbols » but doesn't account for the fact that glibc itself can provide inline version of some functions. It is the case for the memcpy function when -D_FORTIFY_SOURCE=1 is on. In that case an inline version of memcpy calls __memcpy_chk, a function that performs extra runtime checks. Clang currently ignores the inline version and thus provides no runtime check. This code fixes the issue by detecting functions whose name is a builtin name but also have an inline implementation. Differential Revision: https://reviews.llvm.org/D71082
*	[ThinLTO] Pass CodeGenOpts like UnrollLoops/VectorizeLoop/VectorizeSLP	Wei Mi	2020-01-09	1	-0/+50
\| \| \| \| \| \| \| \| \| \| \| \| \|	down to pass builder in ltobackend. Currently CodeGenOpts like UnrollLoops/VectorizeLoop/VectorizeSLP in clang are not passed down to pass builder in ltobackend when new pass manager is used. This is inconsistent with the behavior when new pass manager is used and thinlto is not used. Such inconsistency causes slp vectorization pass not being enabled in ltobackend for O3 + thinlto right now. This patch fixes that. Differential Revision: https://reviews.llvm.org/D72386
*	Add builtins for aligning and checking alignment of pointers and integers	Alex Richardson	2020-01-09	3	-0/+217
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change introduces three new builtins (which work on both pointers and integers) that can be used instead of common bitwise arithmetic: __builtin_align_up(x, alignment), __builtin_align_down(x, alignment) and __builtin_is_aligned(x, alignment). I originally added these builtins to the CHERI fork of LLVM a few years ago to handle the slightly different C semantics that we use for CHERI [1]. Until recently these builtins (or sequences of other builtins) were required to generate correct code. I have since made changes to the default C semantics so that they are no longer strictly necessary (but using them does generate slightly more efficient code). However, based on our experience using them in various projects over the past few years, I believe that adding these builtins to clang would be useful. These builtins have the following benefit over bit-manipulation and casts via uintptr_t: - The named builtins clearly convey the semantics of the operation. While checking alignment using __builtin_is_aligned(x, 16) versus ((x & 15) == 0) is probably not a huge win in readably, I personally find __builtin_align_up(x, N) a lot easier to read than (x+(N-1))&~(N-1). - They preserve the type of the argument (including const qualifiers). When using casts via uintptr_t, it is easy to cast to the wrong type or strip qualifiers such as const. - If the alignment argument is a constant value, clang can check that it is a power-of-two and within the range of the type. Since the semantics of these builtins is well defined compared to arbitrary bit-manipulation, it is possible to add a UBSAN checker that the run-time value is a valid power-of-two. I intend to add this as a follow-up to this change. - The builtins avoids int-to-pointer casts both in C and LLVM IR. In the future (i.e. once most optimizations handle it), we could use the new llvm.ptrmask intrinsic to avoid the ptrtoint instruction that would normally be generated. - They can be used to round up/down to the next aligned value for both integers and pointers without requiring two separate macros. - In many projects the alignment operations are already wrapped in macros (e.g. roundup2 and rounddown2 in FreeBSD), so by replacing the macro implementation with a builtin call, we get improved diagnostics for many call-sites while only having to change a few lines. - Finally, the builtins also emit assume_aligned metadata when used on pointers. This can improve code generation compared to the uintptr_t casts. [1] In our CHERI compiler we have compilation mode where all pointers are implemented as capabilities (essentially unforgeable 128-bit fat pointers). In our original model, casts from uintptr_t (which is a 128-bit capability) to an integer value returned the "offset" of the capability (i.e. the difference between the virtual address and the base of the allocation). This causes problems for cases such as checking the alignment: for example, the expression `if ((uintptr_t)ptr & 63) == 0` is generally used to check if the pointer is aligned to a multiple of 64 bytes. The problem with offsets is that any pointer to the beginning of an allocation will have an offset of zero, so this check always succeeds in that case (even if the address is not correctly aligned). The same issues also exist when aligning up or down. Using the alignment builtins ensures that the address is used instead of the offset. While I have since changed the default C semantics to return the address instead of the offset when casting, this offset compilation mode can still be used by passing a command-line flag. Reviewers: rsmith, aaron.ballman, theraven, fhahn, lebedev.ri, nlopes, aqjune Reviewed By: aaron.ballman, lebedev.ri Differential Revision: https://reviews.llvm.org/D71499
*	[clang] Enforce triple in mempcpy test	serge-sans-paille	2020-01-09	1	-1/+1
\| \| \| \|	Fixes http://lab.llvm.org:8011/builders/llvm-clang-win-x-armv7l/builds/2597
*	[ms] [X86] Use "P" modifier on all branch-target operands in inline X86 ↵	Eric Astor	2020-01-09	1	-0/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	assembly. Summary: Extend D71677 to apply to all branch-target operands, rather than special-casing call instructions. Also add a regression test for llvm.org/PR44272, since this finishes fixing it. Reviewers: thakis, rnk Reviewed By: thakis Subscribers: merge_guards_bot, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72417
*	[Clang] Handle target-specific builtins returning aggregates.	Simon Tatham	2020-01-09	1	-0/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: A few of the ARM MVE builtins directly return a structure type. This causes an assertion failure at code-gen time if you try to assign the result of the builtin to a variable, because the `RValue` created in `EmitBuiltinExpr` from the `llvm::Value` produced by codegen is always made by `RValue::get()`, which creates a non-aggregate `RValue` that will fail an assertion when `AggExprEmitter::withReturnValueSlot` calls `Src.getAggregatePointer()`. A similar failure occurs if you try to use the struct return value directly to extract one field, e.g. `vld2q(address).val[0]`. The existing code-gen tests for those MVE builtins pass the returned structure type directly to the C `return` statement, which apparently managed to avoid that particular code path, so we didn't notice the crash. Now `EmitBuiltinExpr` checks the evaluation kind of the builtin's return value, and does the necessary handling for aggregate returns. I've added two extra test cases, both of which crashed before this change. Reviewers: dmgreen, rjmccall Reviewed By: rjmccall Subscribers: kristof.beyls, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D72271
*	Improve support of GNU mempcpy	serge-sans-paille	2020-01-09	1	-0/+12
\| \| \| \| \| \| \|	- Lower to the memcpy intrinsic - Raise warnings when size/bounds are known Differential Revision: https://reviews.llvm.org/D71374
*	[ARM][MVE] MVE-I should not be disabled by -mfpu=none	Momchil Velikov	2020-01-09	1	-14/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Architecturally, it's allowed to have MVE-I without an FPU, thus -mfpu=none should not disable MVE-I, or moves to/from FP-registers. This patch removes `+/-fpregs` from features unconditionally added to target feature list, depending on FPU and moves the logic to Clang driver, where the negative form (`-fpregs`) is conditionally added to the target features list for the cases of `-mfloat-abi=soft`, or `-mfpu=none` without either `+mve` or `+mve.fp`. Only the negative form is added by the driver, the positive one is derived from other features in the backend. Differential Revision: https://reviews.llvm.org/D71843
*	[ARM,MVE] Intrinsics for variable shift instructions.	Simon Tatham	2020-01-08	1	-0/+1638
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This batch of intrinsics fills in all the shift instructions that take a variable shift distance in a register, instead of an immediate. Some of these instructions take a single shift distance in a scalar register and apply it to all lanes; others take a vector of per-lane distances. These instructions are all basically one family, varying in whether they saturate out-of-range values, and whether they round when bits are shifted off the bottom. I've implemented them at the IR level by a much smaller family of IR intrinsics, which take flag parameters to indicate saturating and/or rounding (along with the usual one to specify signed/unsigned integers). An oddity is that all of them are //left// shift instructions – but if you pass a negative shift count, they'll shift right. So the vector shift distances are always vectors of //signed// integers, regardless of whether you're considering the other input vector to be of signed or unsigned. Also, even the simplest `vshlq` instruction in this family (neither saturating nor rounding) has to be implemented as an IR intrinsic, because the ordinary LLVM IR `shl` operation would consider an out-of-range shift count to be undefined behavior. Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72329
*	[ARM,MVE] Intrinsics for partial-overwrite imm shifts.	Simon Tatham	2020-01-08	1	-0/+1565
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This batch of intrinsics covers two sets of immediate shift instructions, which have in common that they only overwrite part of their output register and so they need an extra input giving its previous value. The VSLI and VSRI instructions shift each lane of the input vector left or right just as if they were normal immediate VSHL/VSHR, but then they only overwrite the output bits that correspond to actual shifted bits of the input. So VSLI will leave the low n bits of each output lane unchanged, and VSRI the same with the top n bits. The V[Q][R]SHR[U]N family are all narrowing shifts: they take an input vector of 2n-bit integers, shift each lane right by a constant, and then narrowing the shifted result to only n bits. So they only overwrite half of the n-bit lanes in the output register, and the B/T suffix indicates whether it's the bottom or top half of each 2n-bit lane. I've implemented the whole of the latter family using a single IR intrinsic `vshrn`, which takes a lot of i32 parameters indicating which instruction it expands to (by specifying signedness of the input and output types, whether it saturates and/or rounds, etc). Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72328
*	Disallow an empty string literal in an asm label	Aaron Ballman	2020-01-08	1	-12/+0
\| \| \| \| \| \| \| \|	An empty string literal in an asm label does not make a whole lot of sense. GCC does not diagnose such a construct, but it also generates code that cannot be assembled by gas should two symbols have an empty asm label within the same TU. This does not affect an asm statement with an empty string literal, which is still a useful construct.
*	[ARM,MVE] Fix many signedness errors in MVE intrinsics.	Simon Tatham	2020-01-06	14	-103/+103
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Running an end-to-end test last week I noticed that a lot of the ACLE intrinsics that operate differently on vectors of signed and unsigned integers were ending up generating the signed version of the instruction unconditionally. This is because the IR intrinsics had no way to distinguish signed from unsigned: the LLVM type system just calls them both `v8i16` (or whatever), so you need either separate intrinsics for signed and unsigned, or a flag parameter that tells ISel which one to choose. This patch fixes all the problems of that kind that I've noticed, by adding an i32 flag parameter to many of the IR intrinsics which is set to 1 for unsigned (matching the existing practice in cases where we got it right), and conditioning all the isel patterns on that flag. So the fundamental change is in `IntrinsicsARM.td`, changing the low-level IR intrinsics API; there are knock-on changes in `arm_mve.td` (adjusting code gen for the ACLE intrinsics to use the modified API) and in `ARMInstrMVE.td` (adjusting isel to expect the new unsigned flags). The rest of this patch is boringly updating tests. Reviewers: dmgreen, miyuki, MarkMurrayARM Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72270
*	[ARM,MVE] Support -ve offsets in gather-load intrinsics.	Simon Tatham	2020-01-06	1	-30/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The ACLE intrinsics with `gather_base` or `scatter_base` in the name are wrappers on the MVE load/store instructions that take a vector of base addresses and an immediate offset. The immediate offset can be up to 127 times the alignment unit, and it can be positive or negative. At the MC layer, we got that right. But in the Sema error checking for the wrapping intrinsics, the offset was erroneously constrained to be positive. To fix this I've adjusted the `imm_mem7bit` class in the Tablegen that defines the intrinsics. But that causes integer literals like `0xfffffffffffffe04` to appear in the autogenerated calls to `SemaBuiltinConstantArgRange`, which provokes a compiler warning because that's out of the non-overflowing range of an `int64_t`. So I've also tweaked `MveEmitter` to emit that as `-0x1fc` instead. Updated the tests of the Sema checks themselves, and also adjusted a random sample of the CodeGen tests to actually use negative offsets and prove they get all the way through code generation without causing a crash. Reviewers: dmgreen, miyuki, MarkMurrayARM Reviewed By: dmgreen Subscribers: kristof.beyls, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72268
*	[SystemZ] Use FNeg in s390x clang builtins	Kevin P. Neal	2020-01-02	4	-22/+22
\| \| \| \|	The s390x builtins are still using FSub instead of FNeg. Correct that.
*	[CodeGen] Emit conj/conjf/confjl libcalls as fneg instructions if possible.	Craig Topper	2019-12-31	2	-7/+25
\| \| \| \| \| \| \|	We already recognize the __builtin versions of these, might as well recognize the libcall version. Differential Revision: https://reviews.llvm.org/D72028
*	[CodeGen] Use IRBuilder::CreateFNeg for __builtin_conj	Craig Topper	2019-12-30	1	-0/+20
\| \| \| \| \| \|	This replaces the fsub -0.0 idiom with an fneg instruction. We didn't see to have a test that showed the current codegen. Just some tests for constant folding and a test that was only checking the declare lines for libcalls. The latter just checked that we did not have a declare for @conj when using __builtin_conj. Differential Revision: https://reviews.llvm.org/D72012
*	[CodeGen] Use CreateFNeg in buildFMulAdd	Craig Topper	2019-12-30	1	-0/+15
\| \| \| \| \| \|	We have an fneg instruction now and should use it instead of the fsub -0.0 idiom. Looks like we had no test that showed that we handled the negation cases here so I've added new tests. Differential Revision: https://reviews.llvm.org/D72010
*	[X86][AsmParser] re-introduce 'offset' operator	Eric Astor	2019-12-30	3	-6/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Amend MS offset operator implementation, to more closely fit with its MS counterpart: 1. InlineAsm: evaluate non-local source entities to their (address) location 2. Provide a mean with which one may acquire the address of an assembly label via MS syntax, rather than yielding a memory reference (i.e. "offset asm_label" and "$asm_label" should be synonymous 3. address PR32530 Based on http://llvm.org/D37461 Fix broken test where the break appears unrelated. - Set up appropriate memory-input rewrites for variable references. - Intel-dialect assembly printing now correctly handles addresses by adding "offset". - Pass offsets as immediate operands (using "r" constraint for offsets of locals). Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D71436
*	Migrate function attribute "no-frame-pointer-elim" to "frame-pointer"="all" ↵	Fangrui Song	2019-12-24	1	-1/+1
\| \| \| \|	as cleanups after D56351
*	[Sema][X86] Consider target attribute into the checks in validateOutputSize ↵	Craig Topper	2019-12-23	1	-0/+32
\| \| \| \| \| \| \| \| \| \| \| \| \|	and validateInputSize. The validateOutputSize and validateInputSize need to check whether AVX or AVX512 are enabled. But this can be affected by the target attribute so we need to factor that in. This patch moves some of the code from CodeGen to create an appropriate feature map that we can pass to the function. Differential Revision: https://reviews.llvm.org/D68627
*	reland "[DebugInfo] Support to emit debugInfo for extern variables"	Yonghong Song	2019-12-22	4	-0/+84
\| \| \| \| \| \| \| \| \| \| \| \| \|	Commit d77ae1552fc21a9f3877f3ed7e13d631f517c825 ("[DebugInfo] Support to emit debugInfo for extern variables") added deebugInfo for extern variables for BPF target. The commit is reverted by 891e25b02d760d0de18c7d46947913b3166047e7 as the committed tests using %clang instead of %clang_cc1 causing test failed in certain scenarios as reported by Reid Kleckner. This patch fixed the tests by using %clang_cc1. Differential Revision: https://reviews.llvm.org/D71818
*	Revert "[DebugInfo] Support to emit debugInfo for extern variables"	Reid Kleckner	2019-12-22	4	-88/+0
\| \| \| \| \| \| \|	This reverts commit d77ae1552fc21a9f3877f3ed7e13d631f517c825. The tests committed along with this change do not pass, and should be changed to use %clang_cc1.
*	[ms] [X86] Use "P" modifier on operands to call instructions in inline X86 ↵	Eric Astor	2019-12-22	3	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	assembly. Summary: This is documented as the appropriate template modifier for call operands. Fixes PR44272, and adds a regression test. Also adds support for operand modifiers in Intel-style inline assembly. Reviewers: rnk Reviewed By: rnk Subscribers: merge_guards_bot, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71677
*	[Driver] Verify -mrecord-mcount in Driver, instead of CodeGen after D71627	Fangrui Song	2019-12-21	2	-6/+0
\| \| \| \| \| \| \| \| \| \| \|	GCC's x86 and s390 ports support -mrecord-mcount. Other ports reject the option. aarch64-linux-gnu-gcc: error: unrecognized command line option ‘-mrecord-mcount’ Allowing this option can cause failures when building Linux kernel for aarch64, powerpc64, etc, which will think the feature is available if the clang command returns 0.
*	[Clang FE, SystemZ] Recognize -mrecord-mcount CL option.	Jonas Paulsson	2019-12-19	1	-0/+26
\| \| \| \| \| \| \| \| \| \|	Recognize -mrecord-mcount from the command line and add a function attribute "mrecord-mcount" when passed. Only valid on SystemZ (when used with -mfentry). Review: Ulrich Weigand https://reviews.llvm.org/D71627
*	[WebAssembly] Add avgr_u intrinsics and require nuw in patterns	Thomas Lively	2019-12-18	1	-0/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The vector pattern `(a + b + 1) / 2` was previously selected to an avgr_u instruction regardless of nuw flags, but this is incorrect in the case where either addition may have an unsigned wrap. This CL changes the existing pattern to require both adds to have nuw flags and adds builtin functions and intrinsics for the avgr_u instructions because the corrected pattern is not representable in C. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71648
*	[Clang FE, SystemZ] Don't add "true" value for the "mnop-mcount" attribute.	Jonas Paulsson	2019-12-18	1	-2/+2
\| \| \| \| \| \| \| \|	Let the "mnop-mcount" function attribute simply be present or non-present. Update SystemZ backend as well to use hasFnAttribute() instead. Review: Ulrich Weigand https://reviews.llvm.org/D71669
*	Add support for the MS qualifiers __ptr32, __ptr64, __sptr, __uptr.	Amy Huang	2019-12-18	1	-0/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This adds parsing of the qualifiers __ptr32, __ptr64, __sptr, and __uptr and lowers them to the corresponding address space pointer for 32-bit and 64-bit pointers. (32/64-bit pointers added in https://reviews.llvm.org/D69639) A large part of this patch is making these pointers ignore the address space when doing things like overloading and casting. https://bugs.llvm.org/show_bug.cgi?id=42359 Reviewers: rnk, rsmith Subscribers: jholewinski, jvesely, nhaehnle, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D71039
*	[FPEnv] Remove unnecessary rounding mode argument for constrained intrinsics	Ulrich Weigand	2019-12-17	1	-18/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The following intrinsics currently carry a rounding mode metadata argument: llvm.experimental.constrained.minnum llvm.experimental.constrained.maxnum llvm.experimental.constrained.ceil llvm.experimental.constrained.floor llvm.experimental.constrained.round llvm.experimental.constrained.trunc This is not useful since the semantics of those intrinsics do not in any way depend on the rounding mode. In similar cases, other constrained intrinsics do not have the rounding mode argument. Remove it here as well. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D71218
*	[Clang FE, SystemZ] Recognize -mpacked-stack CL option	Jonas Paulsson	2019-12-17	1	-0/+11
\| \| \| \| \| \| \| \| \| \| \|	Recognize -mpacked-stack from the command line and add a function attribute "mpacked-stack" when passed. This is needed for building the Linux kernel. If this option is passed for any other target than SystemZ, an error is generated. Review: Ulrich Weigand https://reviews.llvm.org/D71441
*	Reland [NFC-I] Remove hack for fp-classification builtins	Erich Keane	2019-12-17	1	-1/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The FP-classification builtins (__builtin_isfinite, etc) use variadic packs in the definition file to mean an overload set. Because of that, floats were converted to doubles, which is incorrect. There WAS a patch to remove the cast after the fact. THis patch switches these builtins to just be custom type checking, calls the implicit conversions for the integer members, and makes sure the correct L->R casts are put into place, then does type checking like normal. A future direction (that wouldn't be NFC) would consider making conversions for the floating point parameter legal. Note: The initial patch for this missed that certain systems need to still convert half to float, since they dont' support that type.
*	[WebAssembly] Setting export_name implies llvm.used	Sam Clegg	2019-12-16	1	-0/+2
\| \| \| \| \| \| \|	This change updates the clang front end to add symbols to llvm.used when they have explicit export_name attribute. Differential Revision: https://reviews.llvm.org/D71493
*	[WebAssembly] Replace SIMD int min/max builtins with patterns	Thomas Lively	2019-12-16	1	-84/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The instructions were originally implemented via builtins and intrinsics so users would have to explicitly opt-in to using them. This was useful while were validating whether these instructions should have been merged into the spec proposal. Now that they have been, we can use normal codegen patterns, so the intrinsics and builtins are no longer useful. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71500
*	Fix floating point builtins to not promote float->double	Erich Keane	2019-12-16	2	-1/+72
\| \| \| \| \| \| \| \| \| \| \|	As brought up in D71467, a group of floating point builtins automatically promoted floats to doubles because they used the variadic builtin tag to support an overload set. The result is that the parameters were treated as a variadic pack, which always promots float->double. This resulted in the wrong answer being given in cases with certain values of NaN.