bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[AMDGPU] Improve code size cost model (part 2)	dfukalov	2019-11-06	12	-20/+131
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Added estimations for ShuffleVector, some cast and arithmetic instructions Reviewers: rampitec Reviewed By: rampitec Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, zzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69629
*	NeonEmitter: remove special 'a' type modifier.	Tim Northover	2019-11-06	5	-115/+74
\| \| \| \| \| \| \| \|	'a' used to implement a splat in C++ code in NeonEmitter.cpp, but this can be done directly from .td expansions now (and most ops already did). So removing it simplifies the overall code. https://reviews.llvm.org/D69716
*	[TTI][LV] preferPredicateOverEpilogue	Sjoerd Meijer	2019-11-06	9	-7/+186
	We have two ways to steer creating a predicated vector body over creating a scalar epilogue. To force this, we have 1) a command line option and 2) a pragma available. This adds a third: a target hook to TargetTransformInfo that can be queried whether predication is preferred or not, which allows the vectoriser to make the decision without forcing it.
	While this change behaves as a non-functional change for now, it shows the required TTI plumbing, usage of this new hook in the vectoriser, and the beginning of an ARM MVE implementation. I will follow up on this with:
	- a complete MVE implementation, see D69845.
	- a patch to disable this, i.e. we should respect "vector_predicate(disable)" and its corresponding loophint.
	Differential Revision: https://reviews.llvm.org/D69040
*	NeonEmitter: switch to enum for internal Type representation.	Tim Northover	2019-11-06	1	-101/+90
\| \| \| \| \| \| \| \| \|	Previously we had a handful of bools (Signed, Floating, ...) that could easily end up in an inconsistent state. This adds an enum Kind which holds the mutually exclusive states a type might be in, retaining some of the bools that modified an underlying type. https://reviews.llvm.org/D69715
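The refactoring described above can be illustrated with a minimal sketch. Only the names `Signed`, `Floating` and `Kind` come from the commit message; every other name here (`Poly`, `Constant`, `Pointer`, the struct names, the helper function) is a hypothetical stand-in and is not taken from NeonEmitter.cpp.

```cpp
#include <cassert>

// Hypothetical "before": independent flags that can contradict each other.
struct TypeWithBools {
  bool Signed = false;
  bool Floating = false;
  bool Poly = false; // assumed third flag, for illustration only
  // Nothing prevents Signed, Floating and Poly from all being true at once.
};

// Hypothetical "after": one enum holds the mutually exclusive base kind,
// while bools remain only for modifiers that apply on top of any kind.
struct TypeWithKind {
  enum class Kind { Void, Float, SInt, UInt, Poly };
  Kind K = Kind::Void;
  bool Constant = false; // modifier (assumed)
  bool Pointer = false;  // modifier (assumed)

  bool isInteger() const { return K == Kind::SInt || K == Kind::UInt; }
  bool isSigned() const { return K == Kind::SInt; }
  bool isFloating() const { return K == Kind::Float; }
};

// Usage: describe a hypothetical "pointer to signed integer" type.
inline TypeWithKind makeSignedIntPointer() {
  TypeWithKind T;
  T.K = TypeWithKind::Kind::SInt;
  T.Pointer = true;
  assert(T.isInteger() && !T.isFloating()); // holds by construction
  return T;
}
```

With the enum, "signed and floating at the same time" is simply not representable, which is the inconsistency the commit message refers to.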
*	[Syntax] Add nodes for most common statements	Ilya Biryukov	2019-11-06	5	-33/+932
	Summary: Most of the statements mirror the ones provided by clang AST. Major differences are:
	- expressions are wrapped into 'ExpressionStatement' instead of being a subclass of statement,
	- semicolons are always consumed by the leaf expressions (return, expression statement, etc),
	- some clang statements are not handled yet, we wrap those into an UnknownStatement class, which is not present in clang.
	We also define 'Expression' and 'UnknownExpression' classes in order to produce 'ExpressionStatement' where needed. The actual implementation of expressions is not yet ready, it will follow later.
	Reviewers: sammccall Reviewed By: sammccall Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D63835
*	[ARM,MVE] Add intrinsics for gather/scatter load/stores.	Simon Tatham	2019-11-06	8	-74/+4599
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds two new families of intrinsics, both of which are memory accesses taking a vector of locations to load from / store to. The vldrq_gather_base / vstrq_scatter_base intrinsics take a vector of base addresses, and an immediate offset to be added consistently to each one. vldrq_gather_offset / vstrq_scatter_offset take a scalar base address, and a vector of offsets to add to it. The 'shifted_offset' variants also multiply each offset by the element size type, so that the vector is effectively of array indices. At the IR level, these operations are represented by a single set of four IR intrinsics: {gather,scatter} × {base,offset}. The other details (signed/unsigned, shift, and memory element size as opposed to vector element size) are all specified by IR intrinsic polymorphism and immediate operands, because that made the selection job easier than making a huge family of similarly named intrinsics. I considered using the standard IR representations such as llvm.masked.gather, but they're not a good fit. In order to use llvm.masked.gather to represent a gather_offset load with element size smaller than a pointer, you'd have to expand the <8 x i16> vector of offsets into an <8 x i16*> vector of pointers, which would be split up during legalization, so you'd spend most of your time undoing the mess it had made. Also, ISel support for llvm.masked.gather would be easy enough in a trivial way (you can expand it into a gather-base load with a zero immediate offset), but instruction-selecting lots of fiddly idioms back into all the _other_ MVE load instructions would be much more work. So I think dedicated IR intrinsics are the more sensible approach, at least for the moment. 
On the clang tablegen side, I've added two new features to the Tablegen source accepted by MveEmitter: a 'CopyKind' type node for defining a type that varies with the parameter type (it lets you ask for an unsigned integer type of the same width as the parameter), and an 'unsignedflag' value node for passing an immediate IR operand which is 0 for a signed integer type or 1 for an unsigned one. That lets me write each kind of intrinsic just once and get all its subtypes and immediate arguments generated automatically. Also I've tweaked the handling of pointer-typed values in the code generation part of MveEmitter: they're generated as Address rather than Value (i.e. including an alignment) so that they can be given to the ordinary IR load and store operations, but I'd omitted the code to convert them back to Value when they're going to be used as an argument to an IR intrinsic. On the MC side, I've enhanced MVEVectorVTInfo so that it can tell you not only the full assembly-language suffix for a given vector type (like 's32' or 'u16') but also the numeric-only one used by store instructions (just '32' or '16'). Reviewers: dmgreen Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D69791
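The difference between the plain and 'shifted_offset' addressing described above can be modelled in scalar C++. This is only a sketch of the addressing arithmetic under the assumption that unshifted offsets are byte offsets; `gatherOffset` is a hypothetical helper, not one of the MVE intrinsics, and it ignores predication and the signed/unsigned extension details.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Scalar model of a gather load taking a scalar base and a vector of offsets.
// When Shifted is true each offset is multiplied by the element size, so the
// offsets behave like array indices; otherwise they are used as byte offsets.
template <typename Elem, std::size_t N>
std::array<Elem, N> gatherOffset(const void *Base,
                                 const std::array<std::uint32_t, N> &Offsets,
                                 bool Shifted) {
  assert(Base != nullptr);
  std::array<Elem, N> Result{};
  const auto *Bytes = static_cast<const unsigned char *>(Base);
  for (std::size_t I = 0; I < N; ++I) {
    std::size_t ByteOffset =
        Shifted ? Offsets[I] * sizeof(Elem) : std::size_t(Offsets[I]);
    // memcpy avoids alignment and aliasing problems for arbitrary offsets.
    std::memcpy(&Result[I], Bytes + ByteOffset, sizeof(Elem));
  }
  return Result;
}
```

For 16-bit elements, the index vector {3, 0, 7, 1} in shifted mode addresses the same locations as the byte offsets {6, 0, 14, 2} in unshifted mode.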
*	[ARM,MVE] Integer-type nitpicks in MVE intrinsics.	Simon Tatham	2019-11-06	3	-5/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A few integer types in the ACLE definitions of MVE intrinsics are given as 'int' or 'unsigned' instead of <stdint.h> fixed-size types like uint32_t. Usually these are the ones where the size isn't that important, such as immediate offsets in loads (which have a range limited by the instruction encoding) or the carry flag in vadcq which can only be 0 or 1 anyway. With this change, <arm_mve.h> follows that exact type naming, so that the function prototypes look identical to the ones in ACLE, instead of replacing int and unsigned with int32_t and uint32_t. Reviewers: dmgreen Subscribers: kristof.beyls, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D69790
*	[clang,MveEmitter] Fix sign/zero extension in range limits.	Simon Tatham	2019-11-06	1	-7/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the code that generates Sema range checks on constant arguments, I had a piece of code that checks the bounds specified in the Tablegen intrinsic description against the range of the integer type being tested. If the bounds are large enough to permit any value of the integer type, you can omit the compile-time range check. (This case is expected to come up in some of the bitwise operation intrinsics.) But somehow I got my signed/unsigned check backwards (asking for the signed min/max of an unsigned type and vice versa), and also made a sign extension error in which a signed negative value gets zero-extended. Now rewritten more sensibly, and it should get its first sensible test from the next batch of intrinsics I'm planning to add in D69791. Reviewers: dmgreen Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D69789
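The "can the range check be omitted" test described above might look roughly like the following. This is a sketch only: `typeRange` and `rangeCheckIsRedundant` are hypothetical names, and the real MveEmitter code works on its own integer representation rather than on `int64_t`.

```cpp
#include <cassert>
#include <cstdint>

struct Range {
  std::int64_t Min, Max;
};

// Value range of an integer type with the given width. Widths are limited to
// 32 bits here so that every bound fits in int64_t without overflow.
inline Range typeRange(unsigned Bits, bool IsSigned) {
  assert(Bits >= 1 && Bits <= 32);
  if (IsSigned)
    return {-(std::int64_t(1) << (Bits - 1)),
            (std::int64_t(1) << (Bits - 1)) - 1};
  return {0, (std::int64_t(1) << Bits) - 1};
}

// The compile-time check is redundant when the declared bounds [Lo, Hi]
// already admit every value of the argument type. The signedness passed in
// must be that of the type being tested, which is the part the commit
// message says was backwards.
inline bool rangeCheckIsRedundant(std::int64_t Lo, std::int64_t Hi,
                                  unsigned Bits, bool IsSigned) {
  Range R = typeRange(Bits, IsSigned);
  return Lo <= R.Min && Hi >= R.Max;
}
```

Keeping the bounds as signed 64-bit values means a negative lower bound stays negative in the comparison instead of turning into a large positive number.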
*	[ARM MVE] Remove accidental 64-bit vst2/vld2 intrinsics.	Simon Tatham	2019-11-06	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ACLE defines no such intrinsic as vst2q_u64, and the MVE instruction set has no corresponding instruction. But I had accidentally added them to the fledgling <arm_mve.h> anyway, and if you used them, you'd get a compiler crash. Reviewers: dmgreen Subscribers: kristof.beyls, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D69788
*	[clangd] Implement a function to lex the file to find candidate occurrences.	Haojian Wu	2019-11-06	3	-17/+56
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This will be used for incoming cross-file rename (to detect index staleness issue). Reviewers: ilya-biryukov Subscribers: MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D69615
*	clang-format: Add a fallback style to Emacs mode	paulhoad	2019-11-06	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This allows one to enable `clang-format-buffer` on file save and avoid reformatting files that are outside of any project with .clang-format style. Reviewers: djasper, klimek, sammccall, owenpan, mitchell-stellar, MyDeveloperDay Reviewed By: MyDeveloperDay Subscribers: cfe-commits Patch By: dottedmag Tags: #clang, #clang-format Differential Revision: https://reviews.llvm.org/D69752
*	[clang-format] [PR35518] C++17 deduction guides are wrongly formatted	paulhoad	2019-11-06	2	-0/+91
	Summary: see https://bugs.llvm.org/show_bug.cgi?id=35518
	clang-format removes spaces around deduction guides but not trailing return types; make them consistent.
	```
	template <typename T> S(T)->S<T>;
	auto f(int, int) -> double;
	```
	becomes
	```
	template <typename T> S(T) -> S<T>;
	auto f(int, int) -> double;
	```
	Reviewers: klimek, mitchell-stellar, owenpan, sammccall, lichray, curdeius, KyrBoh Reviewed By: curdeius Subscribers: merge_guards_bot, hans, lichray, cfe-commits Tags: #clang-format, #clang-tools-extra, #clang Differential Revision: https://reviews.llvm.org/D69577
*	gn build: Merge 24130d661ed	LLVM GN Syncbot	2019-11-06	1	-0/+1
*	[clang-tidy] Add readability-make-member-function-const	Matthias Gehre	2019-11-06	8	-0/+708
	Summary: Finds non-static member functions that can be made ``const`` because the functions don't use ``this`` in a non-const way. The check conservatively tries to preserve logical constness in favor of physical constness. See readability-make-member-function-const.rst for more details. Reviewers: aaron.ballman, gribozavr, hokein, alexfh Subscribers: mgorny, xazax.hun, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D68074
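As an illustration of the kind of code such a check is aimed at, consider the hypothetical class below. Which members the check actually reports is an assumption based on the summary above; the authoritative rules are in readability-make-member-function-const.rst.

```cpp
#include <cassert>

class Counter {
  int Value = 0;
  int *Slot = nullptr;

public:
  // Only reads a member: a natural candidate for `const`.
  int get() { return Value; }

  // Already const; nothing to report.
  bool empty() const { return Value == 0; }

  // Modifies state, so it has to stay non-const.
  void bump() { ++Value; }

  // Physically this could be const, because a const member function may
  // still return a copy of a pointer member. Doing so would hand out write
  // access through a const object, so a check that favors logical constness
  // would presumably leave it alone.
  int *slot() { return Slot; }
};

// Usage of the hypothetical class.
inline int bumpTwice(Counter &C) {
  C.bump();
  C.bump();
  assert(!C.empty());
  return C.get();
}
```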
*	YAML parser robustness improvements	Thomas Finch	2019-11-05	3	-14/+93
	Summary: This patch fixes a number of bugs found in the YAML parser through fuzzing. In general, this makes the parser more robust against malformed inputs. The fixes are mostly improved null checking and returning errors in more cases. In some cases, asserts were changed to regular errors; this provides the same robustness but also protects release builds from the triggering conditions. This also improves the fuzzability of the YAML parser since asserts can act as a roadblock to further fuzzing once they're hit.
	Each fix has a corresponding test case:
	- TestAnchorMapError - Added proper null pointer handling in `Stream::printError` if N is null and `KeyValueNode::getValue` if getKey returns null, `Input::createHNodes` `dyn_casts` changed to `dyn_cast_or_null` so the null pointer checks are actually able to fail
	- TestFlowSequenceTokenErrors - Added case in `Document::parseBlockNode` for FlowMappingEnd, FlowSequenceEnd, or FlowEntry tokens outside of mappings or sequences
	- TestDirectiveMappingNoValue - Changed assert to regular error return in `Scanner::scanValue`
	- TestUnescapeInfiniteLoop - Fixed infinite loop in `ScalarNode::unescapeDoubleQuoted` by returning an error for unrecognized escape codes
	- TestScannerUnexpectedCharacter - Changed asserts to regular error returns in `Scanner::consume`
	- TestUnknownDirective - For both of the inputs the stream doesn't fail and correctly returns TK_Error, but there is no valid root node for the document. There's no reasonable way to make the scanner fail for unknown directives without breaking the YAML spec (see spec-07-01.test). I think the assert is unnecessary given that an error is still generated for this case.
	The `SimpleKeys.clear()` line fixes a bug found by AddressSanitizer triggered by multiple test cases - when TokenQueue is cleared SimpleKeys is still holding dangling pointers into it, so SimpleKeys should be cleared as well.
	Patch by Thomas Finch!
	Reviewers: chandlerc, Bigcheese, hintonda Reviewed By: Bigcheese, hintonda Subscribers: hintonda, kristina, beanz, dexonsmith, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61608
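The TestUnescapeInfiniteLoop item describes a general pattern: an unescaping loop should report an error for an escape it does not recognize rather than try to continue. A heavily simplified sketch of that pattern follows. It is not the code of `ScalarNode::unescapeDoubleQuoted`; the free function, its signature and the small set of supported escapes are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical, simplified unescaper that handles only \n, \t, \\ and \".
// Returns false for an unrecognized or truncated escape. Each iteration
// consumes at least one character or returns, so the loop terminates.
inline bool unescapeDoubleQuoted(const std::string &In, std::string &Out) {
  assert(&In != &Out && "output must not alias input");
  Out.clear();
  for (std::size_t I = 0; I < In.size(); ++I) {
    if (In[I] != '\\') {
      Out.push_back(In[I]);
      continue;
    }
    if (++I == In.size())
      return false; // trailing backslash
    switch (In[I]) {
    case 'n':  Out.push_back('\n'); break;
    case 't':  Out.push_back('\t'); break;
    case '\\': Out.push_back('\\'); break;
    case '"':  Out.push_back('"');  break;
    default:
      return false; // unrecognized escape code: report an error
    }
  }
  return true;
}
```

Returning an error instead of asserting keeps the behaviour the same in release builds, which is the point the summary makes about asserts and fuzzing.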
*	[ADT] Add equality operator for SmallPtrSet	Yevgeny Rouban	2019-11-06	2	-0/+64
\| \| \| \| \|	Reviewed By: tellenbach Differential Revision: https://reviews.llvm.org/D69429
*	[PowerPC] Fix the incorrect 'RM' flag set on load/store instr	QingShan Zhang	2019-11-06	2	-1/+10
	The 'RM' flag models the "Rounding Mode" and has nothing to do with the load/store instructions. Differential Revision: https://reviews.llvm.org/D69551
*	Implement `sys::getHostCPUName()` for Darwin ARM	Chris Bieneman	2019-11-05	1	-1/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Currently there is no implementation of `sys::getHostCPUName()` for Darwin ARM targets. This patch makes it so that LLVM running on ARM makes reasonable guesses about the CPU features of the host CPU. Reviewers: t.p.northover, lhames, efriedma Reviewed By: efriedma Subscribers: rjmccall, efriedma, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69597
*	Fixed a profdata file size detection on Windows system.	Vladimir Vereschaka	2019-11-05	1	-1/+1
	Space characters are allowed in group names on Windows (for example: Domain Users). In that case the test extracts the wrong field from the output when reading the size of the profdata file. This patch avoids printing the group names in the test output, so that the proper field is extracted as the file size. Differential Revision: https://reviews.llvm.org/D69317
*	[IRMover] Set Address Space for moved global values	Teresa Johnson	2019-11-05	2	-4/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Set Address Space when creating a new function (from another). Fix PR41154. Patch by Ehud Katz <ehudkatz@gmail.com> Reviewers: tejohnson, chandlerc Reviewed By: tejohnson Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69361
*	[globalisel][docs] Rework GMIR documentation and add an early GenericOpcode reference	Daniel Sanders	2019-11-05	2	-22/+38
	It looks like I pushed an older version of this commit without the review fixups earlier. This applies the review changes
	Differential Revision: https://reviews.llvm.org/D69545
*	[globalisel][docs] Rework GMIR documentation and add an early GenericOpcode reference	Daniel Sanders	2019-11-05	7	-76/+803
	Summary: Rework the GMIR documentation to focus more on the end user than the implementation and tie it in to the MIR document. There was also some out-of-date information which has been removed. The quality of the GenericOpcode reference is highly variable and drops sharply as I worked through them all but we've got to start somewhere :-). It would be great if others could expand on this too as there is an awful lot to get through. Also fix a typo in the definition of G_FLOG. Previously, the comments said we had two base-2's (G_FLOG and G_FLOG2). Reviewers: aemerson, volkan, rovka, arsenm Reviewed By: rovka Subscribers: wdng, arphaman, jfb, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69545
*	[Automaton] Make Automaton thread-safe	James Molloy	2019-11-05	1	-1/+11
\| \| \| \| \| \| \| \| \| \|	In an optimization to improve performance (rL375240) we added a std::shared_ptr around the main table map. This is safe, but we also ended up making the transcriber object a std::shared_ptr too. This has mutable state, so must be copied when we copy the Automaton object. This is very cheap; the main optimization was about the map `M` only. Reported by Dan Palermo. No test as triggering this is rather hard from a unit test.
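The ownership split described above (share the immutable table, copy the mutable helper) can be sketched as follows. The class below is a hypothetical miniature and not LLVM's `Automaton`; only the names `M` and "transcriber" come from the commit message, and the table layout and method names are assumptions.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the helper object that carries mutable state.
struct Transcriber {
  std::vector<int> Path;
};

class Automaton {
public:
  // (state, action) -> next state
  using Table = std::map<std::pair<int, int>, int>;

  explicit Automaton(Table Tbl)
      : M(std::make_shared<const Table>(std::move(Tbl))),
        T(new Transcriber()) {}

  // Copying shares the read-only table but deep-copies the transcriber, so
  // two copies never mutate the same object.
  Automaton(const Automaton &Other)
      : M(Other.M), T(new Transcriber(*Other.T)), State(Other.State) {}
  Automaton &operator=(const Automaton &) = delete;

  bool add(int Action) {
    assert(M && T);
    auto It = M->find(std::make_pair(State, Action));
    if (It == M->end())
      return false;
    State = It->second;
    T->Path.push_back(Action);
    return true;
  }

  const std::vector<int> &path() const { return T->Path; }
  bool sharesTableWith(const Automaton &O) const { return M == O.M; }

private:
  std::shared_ptr<const Table> M; // immutable, cheap to share between copies
  std::unique_ptr<Transcriber> T; // mutable, owned per copy
  int State = 0;
};
```

Sharing `M` keeps the copy cheap, while giving each copy its own transcriber is what lets copies be used independently, for example from different threads.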
*	[globalisel][docs] Add a section about debugging with the block extractor	Daniel Sanders	2019-11-05	2	-0/+55
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Depends on D69644 Reviewers: rovka, volkan, arsenm Subscribers: wdng, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69645
*	[AMDGPU] Add missing flags to DS_Real	Stanislav Mekhanoshin	2019-11-05	1	-0/+2
\| \| \| \|	Differential Revision: https://reviews.llvm.org/D69867
*	[SLP] add tests for 2-wide reductions; NFC	Sanjay Patel	2019-11-05	1	-2/+115
*	[TestMTCSimple] Disable the test if you don't have libMTC	Alex Langford	2019-11-05	1	-1/+5
	If you are running on macOS and have the CommandLineTools installed instead of Xcode, this test will fail because CommandLineTools doesn't ship with libMainThreadChecker. Skip the test if you don't have it installed.
*	Revert "[analyzer] Add test directory for scan-build."	Volodymyr Sapsai	2019-11-05	10	-152/+1
\| \| \| \| \| \| \| \|	This reverts commit 0aba69eb1a01c44185009f50cc633e3c648e9950 with subsequent changes to test files. It caused test failures on GreenDragon, e.g., http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/
*	[IRMover] Use GlobalValue::getAddressSpace instead of directly from its type [NFC]	Teresa Johnson	2019-11-05	1	-10/+10
	Summary: Change the old form of G->getType()->getAddressSpace() to the new G->getAddressSpace() (underneath does the same). Patch by Ehud Katz <ehudkatz@gmail.com> Reviewers: tejohnson, chandlerc Reviewed By: tejohnson Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69550
*	[mips] Fix `getRegForInlineAsmConstraint` to not crash on empty Constraint	Simon Atanasyan	2019-11-06	2	-4/+20
*	[CMake] Prevent adding lld to test dependency (TEST_DEPS) when lld project is not built	Kelvin Li	2019-11-05	1	-1/+1
	D69405 causes failure if running LIT when the compiler was built without lld. Patch by Anh Tuyen Tran (anhtuyen) Differential Revision: https://reviews.llvm.org/D69685
*	[LoopRotationUtils] Check values are newly inserted into maps.	Alina Sbirlea	2019-11-05	1	-5/+14
\| \| \| \| \|	This is a cleanup that came up in D63680. All values added to the ValueMaps should be newly added.
*	[Hexagon] getCompoundCandidateGroup - fix 'false' value is implicitly cast to unsigned warning. NFCI.	Simon Pilgrim	2019-11-05	1	-5/+5
	Consistently return HexagonII::HCG_None.
*	[lldb] Add a install target for lldb python on darwin	Haibo Huang	2019-11-05	1	-12/+15
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: Similar to D68370 but for darwin framework build. Reviewers: aadsm Subscribers: mgorny, lldb-commits Tags: #lldb Differential Revision: https://reviews.llvm.org/D69834
*	[X86/Atomics] Correct a few transforms for new atomic lowering	Philip Reames	2019-11-05	1	-4/+3
	This is a partial fix for the issues described in the commit message of 027aa27 (the revert of G24609). Unfortunately, I can't provide test coverage for it on its own as the only (known) wrong example is still wrong, but due to a separate issue. These fixes are cases where, when performing unrelated DAG combines, we were dropping the atomicity flags entirely.
*	Fix typo so that '-O0' is correctly specified	Bill Wendling	2019-11-05	1	-3/+3
*	[OPENMP50]Simplify processing of context selector scores.	Alexey Bataev	2019-11-05	6	-79/+32
\| \| \| \| \| \|	If the context selector score was not specified, its value must be set to 0. Simplify the processing of unspecified scores + save memory in attribute representation.
*	[MIR] Add MIR parsing for heap alloc site instruction markers	Amy Huang	2019-11-05	7	-5/+131
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch adds MIR parsing and printing for heap alloc markers, which were added in D69136. They are printed as an operand similar to pre-/post-instr symbols, with a heap-alloc-marker token and a metadata node. Reviewers: rnk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69864
*	[Sema] Fixes templated friend member assertion	Mark de Wever	2019-11-05	2	-1/+13
\| \| \| \| \| \|	Fixes PR41792: Clang assertion failure on templated friend member function Differential Revision: https://reviews.llvm.org/D69481
*	[ValueObject] Upstream early exit from swift-lldb. (NFC)	Adrian Prantl	2019-11-05	1	-0/+4
*	[ValueObject] Upstream initialization from swift-lldb.	Adrian Prantl	2019-11-05	1	-0/+10
	This is a non-Swift-specific change in swift-lldb that seems to be useful for remote debugging. If it does in fact turn out to be redundant we can remove it from llvm.org and then it will disappear in swift-lldb, too.
*	[Reproducer] Add test case for expression evaluation	Jonas Devlieghere	2019-11-05	2	-0/+28
*	[X86] Gate select->fmin/fmax transform on NoSignedZeros instead of UnsafeFPMath	Benjamin Kramer	2019-11-05	2	-9/+8
*	TestBatchMode.py: add missing @skipIfRemote	Fred Riss	2019-11-05	1	-0/+1
\| \| \| \| \|	All the tests in this file were already marked as skipped for remote tests except for this one.
*	testsuite: skipIfNoSBHeaders should skip when running remotely	Fred Riss	2019-11-05	1	-0/+3
\| \| \| \| \|	The LLDB dylib/framework will not be available on the remote host, it makes no sense to try to run those tests in a remote scenario.
*	Modernize add-dsym test Makefile	Fred Riss	2019-11-05	1	-17/+7
*	Revert "[lit] Better/earlier errors when no tests are executed"	Julian Lettner	2019-11-05	4	-29/+9
\| \| \| \|	This reverts commit d8f2bff75126c6dde694ad245f9807fa12ad5630.
*	[AMDGPU] Removed dead code from R600ISelLowering.cpp	Stanislav Mekhanoshin	2019-11-05	1	-6/+1
\| \| \| \| \| \| \| \| \|	This was added to inhibit a warning from gcc 7.3 according to the comment. However, it triggers warning from PVS. In addition I cannot reproduce it with gcc 7.4 and I also cannot reproduce it with gcc 7.3 using compiler explorer. Differential Revision: https://reviews.llvm.org/D69863
*	[X86/Atomics] (Semantically) revert G246098, switch back to the old atomic example	Philip Reames	2019-11-05	4	-47/+168
	When writing an email for a follow up proposal, I realized one of the diffs in the committed change was incorrect. Digging into it revealed that the fix is complicated enough to require some thought, so reverting in the meantime.
	The problem is visible in this diff (from the revert):
	 ; X64-SSE-LABEL: store_fp128:
	 ; X64-SSE: # %bb.0:
	-; X64-SSE-NEXT: movaps %xmm0, (%rdi)
	+; X64-SSE-NEXT: subq $24, %rsp
	+; X64-SSE-NEXT: .cfi_def_cfa_offset 32
	+; X64-SSE-NEXT: movaps %xmm0, (%rsp)
	+; X64-SSE-NEXT: movq (%rsp), %rsi
	+; X64-SSE-NEXT: movq {{[0-9]+}}(%rsp), %rdx
	+; X64-SSE-NEXT: callq __sync_lock_test_and_set_16
	+; X64-SSE-NEXT: addq $24, %rsp
	+; X64-SSE-NEXT: .cfi_def_cfa_offset 8
	 ; X64-SSE-NEXT: retq
	 store atomic fp128 %v, fp128* %fptr unordered, align 16
	 ret void
	The problem here is threefold:
	1) x86-64 doesn't guarantee atomicity of anything larger than 8 bytes. Some platforms observably break this guarantee, others don't, but the codegen isn't considering this, so it's wrong on at least some platforms.
	2) When I started to track down the problem, I discovered that DAGCombiner had stripped the atomicity off the store entirely. This comes down to idiomatic usage of DAG.getStore passing all MMO components separately as opposed to just passing the MMO.
	3) On x86 (not -64), there are cases where 8 byte atomicity is supported, but only for floating point operations. This would seem to imply that operation typing matters for correctness, and DAGCombine happily folds away bitcasts. I'm not 100% sure there's a problem here, but I'm not entirely sure there isn't either.
	I plan on returning to each issue in turn; sorry for the churn here.
*	[HIP] Fix visibility for 'extern' device variables.	Michael Liao	2019-11-05	2	-3/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: - Fix a bug which misses the change for a variable to be set with target-specific attributes. Reviewers: yaxunl Subscribers: jvesely, nhaehnle, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D63020