bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[MemCpyOpt] Use LocationSize instead of ints; NFC	George Burgess IV	2018-12-23	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Trying to keep these patches super small so they're easily post-commit verifiable, as requested in D44748. srcSize is derived from the size of an alloca, and we quit out if the size of that is > the size of the thing we're copying to. Hence, we should always copy everything over, so these sizes are precise. Don't make srcSize itself a LocationSize, since optionality isn't helpful, and we do some comparisons against other sizes elsewhere in that function. llvm-svn: 350019
*	[llvm] API for encoding/decoding DWARF discriminators.	Mircea Trofin	2018-12-21	4	-12/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Added a pair of APIs for encoding/decoding the 3 components of a DWARF discriminator described in http://lists.llvm.org/pipermail/llvm-dev/2016-October/106532.html: the base discriminator, the duplication factor (useful in profile-guided optimization) and the copy index (used to identify copies of code in cases like loop unrolling) The encoding packs 3 unsigned values in 32 bits. This CL addresses 2 issues: - communicates overflow back to the user - supports encoding all 3 components together. Current APIs assume a sequencing of events. For example, creating a new discriminator based on an existing one by changing the base discriminator was not supported. Reviewers: davidxl, danielcdh, wmi, dblaikie Reviewed By: dblaikie Subscribers: zzheng, dmgreen, aprantl, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D55681 llvm-svn: 349973
*	[IR] Add Instruction::isLifetimeStartOrEnd, NFC	Vedant Kumar	2018-12-21	10	-41/+19
\| \| \| \| \| \| \| \| \| \| \|	Instruction::isLifetimeStartOrEnd() checks whether an Instruction is an llvm.lifetime.start or an llvm.lifetime.end intrinsic. This was suggested as a cleanup in D55967. Differential Revision: https://reviews.llvm.org/D56019 llvm-svn: 349964
*	[RuntimeUnrolling] NFC: Add TODO and comments in connectProlog	Anna Thomas	2018-12-21	1	-0/+18
\| \| \| \| \| \| \| \|	Currently, runtime unrolling does not support loops where multiple exiting blocks exit to the latchExit. Added TODO and other code clarifications for ConnectProlog code. llvm-svn: 349944
*	[X86][SSE] Auto upgrade PADDS/PSUBS intrinsics to SADD_SAT/SSUB_SAT generic ↵	Simon Pilgrim	2018-12-21	1	-78/+0
\| \| \| \| \| \| \| \| \| \| \| \|	intrinsics (llvm) This auto upgrades the signed SSE saturated math intrinsics to SADD_SAT/SSUB_SAT generic intrinsics. Clang counterpart: https://reviews.llvm.org/D55890 Differential Revision: https://reviews.llvm.org/D55894 llvm-svn: 349892
*	[memcpyopt] Add debug logs when forwarding memcpy src to dst	Reid Kleckner	2018-12-21	1	-0/+2
\| \| \| \|	llvm-svn: 349873
*	[LoopUnroll] Don't verify domtree by default with +Asserts.	Eli Friedman	2018-12-21	2	-3/+5
\| \| \| \| \| \| \| \| \| \|	This verification is linear in the size of the function, so it can cause a quadratic compile-time explosion in a function with many loops to unroll. Differential Revision: https://reviews.llvm.org/D54732 llvm-svn: 349871
*	cmake: Remove add_llvm_loadable_module()	Tom Stellard	2018-12-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This function is very similar to add_llvm_library(), so this patch merges it into add_llvm_library() and replaces all calls to add_llvm_loadable_module(lib ...) with add_llvm_library(lib MODULE ...) Reviewers: philip.pfaffe, beanz, chandlerc Reviewed By: philip.pfaffe Subscribers: chapuni, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D51748 llvm-svn: 349839
*	[InstCombine] Preserve access-group metadata.	Michael Kruse	2018-12-20	1	-1/+1
\| \| \| \| \| \| \| \| \|	Preserve llvm.access.group metadata when combining store instructions. This was forgotten in r349725. Fixes llvm.org/PR40117 llvm-svn: 349774
*	[InstCombine][AMDGPU] Handle more buffer intrinsics	Piotr Sobczak	2018-12-20	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Include the following intrinsics in the InsctCombine simplification: * amdgcn_raw_buffer_load * amdgcn_raw_buffer_load_format * amdgcn_struct_buffer_load * amdgcn_struct_buffer_load_format Change-Id: I14deceff74bcb21179baf6aa6e94bf39e7d63d5d Reviewers: arsenm Reviewed By: arsenm Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D55882 llvm-svn: 349735
*	[MSan] Don't emit __msan_instrument_asm_load() calls	Alexander Potapenko	2018-12-20	1	-10/+4
\| \| \| \| \| \| \| \| \| \| \|	LLVM treats void* pointers passed to assembly routines as pointers to sized types. We used to emit calls to __msan_instrument_asm_load() for every such void*, which sometimes led to false positives. A less error-prone (and truly "conservative") approach is to unpoison only assembly output arguments. llvm-svn: 349734
*	[HWASAN] Add support for memory intrinsics	Eugene Leviant	2018-12-20	1	-0/+42
\| \| \| \| \| \|	Differential revision: https://reviews.llvm.org/D55117 llvm-svn: 349728
*	Introduce llvm.loop.parallel_accesses and llvm.access.group metadata.	Michael Kruse	2018-12-20	12	-59/+59
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current llvm.mem.parallel_loop_access metadata has a problem in that it uses LoopIDs. LoopID unfortunately is not loop identifier. It is neither unique (there's even a regression test assigning the some LoopID to multiple loops; can otherwise happen if passes such as LoopVersioning make copies of entire loops) nor persistent (every time a property is removed/added from a LoopID's MDNode, it will also receive a new LoopID; this happens e.g. when calling Loop::setLoopAlreadyUnrolled()). Since most loop transformation passes change the loop attributes (even if it just to mark that a loop should not be processed again as llvm.loop.isvectorized does, for the versioned and unversioned loop), the parallel access information is lost for any subsequent pass. This patch unlinks LoopIDs and parallel accesses. llvm.mem.parallel_loop_access metadata on instruction is replaced by llvm.access.group metadata. llvm.access.group points to a distinct MDNode with no operands (avoiding the problem to ever need to add/remove operands), called "access group". Alternatively, it can point to a list of access groups. The LoopID then has an attribute llvm.loop.parallel_accesses with all the access groups that are parallel (no dependencies carries by this loop). This intentionally avoid any kind of "ID". Loops that are clones/have their attributes modifies retain the llvm.loop.parallel_accesses attribute. Access instructions that a cloned point to the same access group. It is not necessary for each access to have it's own "ID" MDNode, but those memory access instructions with the same behavior can be grouped together. The behavior of llvm.mem.parallel_loop_access is not changed by this patch, but should be considered deprecated. Differential Revision: https://reviews.llvm.org/D52116 llvm-svn: 349725
*	[asan] Undo special treatment of linkonce_odr and weak_odr	Vitaly Buka	2018-12-20	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: On non-Windows these are already removed by ShouldInstrumentGlobal. On Window we will wait until we get actual issues with that. Reviewers: pcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D55899 llvm-svn: 349707
*	[asan] Prevent folding of globals with redzones	Vitaly Buka	2018-12-20	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: ICF prevented by removing unnamed_addr and local_unnamed_addr for all sanitized globals. Also in general unnamed_addr is not valid here as address now is important for ODR violation detector and redzone poisoning. Before the patch ICF on globals caused: 1. false ODR reports when we register global on the same address more than once 2. globals buffer overflow if we fold variables of smaller type inside of large type. Then the smaller one will poison redzone which overlaps with the larger one. Reviewers: eugenis, pcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D55857 llvm-svn: 349706
*	Revert "[BDCE][DemandedBits] Detect dead uses of undead instructions"	Nikita Popov	2018-12-19	1	-23/+15
\| \| \| \| \| \| \|	This reverts commit r349674. It causes a failure in test-suite enc-3des.execution_time. llvm-svn: 349684
*	[BDCE][DemandedBits] Detect dead uses of undead instructions	Nikita Popov	2018-12-19	1	-15/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This (mostly) fixes https://bugs.llvm.org/show_bug.cgi?id=39771. BDCE currently detects instructions that don't have any demanded bits and replaces their uses with zero. However, if an instruction has multiple uses, then some of the uses may be dead (have no demanded bits) even though the instruction itself is still live. This patch extends DemandedBits/BDCE to detect such uses and replace them with zero. While this will not immediately render any instructions dead, it may lead to simplifications (in the motivating case, by converting a rotate into a simple shift), break dependencies, etc. The implementation tries to strike a balance between analysis power and complexity/memory usage. Originally I wanted to track demanded bits on a per-use level, but ultimately we're only really interested in whether a use is entirely dead or not. I'm using an extra set to track which uses are dead. However, as initially all uses are dead, I'm not storing uses those user is also dead. This case is checked separately instead. The test case has a couple of cases that are not simplified yet. In particular, we're only looking at uses of instructions right now. I think it would make sense to also extend this to arguments. Furthermore DemandedBits doesn't yet know some of the tricks that InstCombine does for the demanded bits or bitwise or/and/xor in combination with known bits information. Differential Revision: https://reviews.llvm.org/D55563 llvm-svn: 349674
*	Test commit	Anton Afanasyev	2018-12-19	1	-4/+4
\| \| \| \| \| \|	Fix typos. llvm-svn: 349644
*	[asan] Restore ODR-violation detection on vtables	Vitaly Buka	2018-12-18	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: unnamed_addr is still useful for detecting of ODR violations on vtables Still unnamed_addr with lld and --icf=safe or --icf=all can trigger false reports which can be avoided with --icf=none or by using private aliases with -fsanitize-address-use-odr-indicator Reviewers: eugenis Reviewed By: eugenis Subscribers: kubamracek, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D55799 llvm-svn: 349555
*	[asan] In llvm.asan.globals, allow entries to be non-GlobalVariable and skip ↵	Kuba Mracek	2018-12-18	1	-1/+4
\| \| \| \| \| \| \| \| \| \|	over them Looks like there are valid reasons why we need to allow bitcasts in llvm.asan.globals, see discussion at https://github.com/apple/swift-llvm/pull/133. Let's look through bitcasts when iterating over entries in the llvm.asan.globals list. Differential Revision: https://reviews.llvm.org/D55794 llvm-svn: 349544
*	Change the objc ARC optimizer to use the new objc.* intrinsics	Pete Cooper	2018-12-18	2	-59/+17
\| \| \| \| \| \| \| \| \| \| \|	We're moving ARC optimisation and ARC emission in clang away from runtime methods and towards intrinsics. This is the part which actually uses the intrinsics in the ARC optimizer when both analyzing the existing calls and emitting new ones. Differential Revision: https://reviews.llvm.org/D55348 Reviewers: ahatanak llvm-svn: 349534
*	[InstCombine] Simplify cttz/ctlz + icmp eq/ne into mask check	Nikita Popov	2018-12-18	1	-3/+22
\| \| \| \| \| \| \| \| \| \| \| \| \|	Checking whether a number has a certain number of trailing / leading zeros means checking whether it is of the form XXXX1000 / 0001XXXX, which can be done with an and+icmp. Related to https://bugs.llvm.org/show_bug.cgi?id=28668. As a next step, this can be extended to non-equality predicates. Differential Revision: https://reviews.llvm.org/D55745 llvm-svn: 349530
*	[SCCP] Get rid of redundant call for getPredicateInfoFor (NFC).	Florian Hahn	2018-12-18	1	-1/+1
\| \| \| \| \| \|	We can use the result fetched a few lines above. llvm-svn: 349527
*	[InstCombine] refactor isCheapToScalarize(); NFC	Sanjay Patel	2018-12-18	1	-33/+25
\| \| \| \| \| \| \| \| \|	As the FIXME indicates, this has the potential to go overboard. So I'm not sure if it's even worth keeping this vs. iteratively doing simple matches, but we might as well clean it up. llvm-svn: 349523
*	[LoopVectorize] Rename pass options. NFC.	Michael Kruse	2018-12-18	3	-16/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rename: NoUnrolling to InterleaveOnlyWhenForced and AlwaysVectorize to !VectorizeOnlyWhenForced Contrary to what the name 'AlwaysVectorize' suggests, it does not unconditionally vectorize all loops, but applies a cost model to determine whether vectorization is profitable to all loops. Hence, passing false will disable the cost model, except when a loop is marked with llvm.loop.vectorize.enable. The 'OnlyWhenForced' suffix (suggested by @hfinkel in D55716) better matches this behavior. Similarly, 'NoUnrolling' disables the profitability cost model for interleaving (a term to distinguish it from unrolling by the LoopUnrollPass); rename it for consistency. Differential Revision: https://reviews.llvm.org/D55785 llvm-svn: 349513
*	[LoopUnroll] Honor '#pragma unroll' even with -fno-unroll-loops.	Michael Kruse	2018-12-18	2	-33/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When using clang with `-fno-unroll-loops` (implicitly added with `-O1`), the LoopUnrollPass is not not added to the (legacy) pass pipeline. This also means that it will not process any loop metadata such as llvm.loop.unroll.enable (which is generated by #pragma unroll or WarnMissedTransformationsPass emits a warning that a forced transformation has not been applied (see https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181210/610833.html). Such explicit transformations should take precedence over disabling heuristics. This patch unconditionally adds LoopUnrollPass to the optimizing pipeline (that is, it is still not added with `-O0`), but passes a flag indicating whether automatic unrolling is dis-/enabled. This is the same approach as LoopVectorize uses. The new pass manager's pipeline builder has no option to disable unrolling, hence the problem does not apply. Differential Revision: https://reviews.llvm.org/D55716 llvm-svn: 349509
*	[IPO][AVR] Create new Functions in the default address space specified in ↵	Dylan McKay	2018-12-18	8	-17/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	the data layout This modifies the IPO pass so that it respects any explicit function address space specified in the data layout. In targets with nonzero program address spaces, all functions should, by default, be placed into the default program address space. This is required for Harvard architectures like AVR. Without this, the functions will be marked as residing in data space, and thus not be callable. This has no effect to any in-tree official backends, as none use an explicit program address space in their data layouts. Patch by Tim Neumann. llvm-svn: 349469
*	SROA: preserve alignment tags on loads and stores.	Tim Northover	2018-12-18	1	-16/+43
\| \| \| \| \| \| \| \| \| \| \|	When splitting up an alloca's uses we were dropping any explicit alignment tags, which means they default to the ABI-required default alignment and this can cause miscompiles if the real value was smaller. Also refactor the TBAA metadata into a parent class since it's shared by both children anyway. llvm-svn: 349465
*	hwasan: Move ctor into a comdat.	Peter Collingbourne	2018-12-17	1	-12/+20
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D55733 llvm-svn: 349413
*	[AggressiveInstCombine] convert rotate with guard branch into funnel shift ↵	Sanjay Patel	2018-12-17	1	-1/+96
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(PR34924) Now, that we have funnel shift intrinsics, it should be safe to convert this form of rotate to it. In the worst case (a target that doesn't have rotate instructions), we will expand this into a branch-less sequence of ALU ops (neg/and/and/lshr/shl/or) in the backend, so it's still very likely to be a perf improvement over the original code. The motivating source code pattern for this is shown in: https://bugs.llvm.org/show_bug.cgi?id=34924 Background: I looked at several different options before deciding where to try this - instcombine, simplifycfg, CGP - because it doesn't fit cleanly anywhere AFAIK. The backend (CGP, SDAG, GlobalIsel?) is too late for what we're trying to accomplish. We want to have the IR converted before we reach things like vectorization because the reduced code can make a loop much simpler to transform. Technically, this could be included in instcombine, but it's a large pattern match that includes control-flow, so it just felt wrong to stuff into there (although I have a draft of that patch). Similarly, this could be part of simplifycfg, but all of this pattern matching is a stretch. So we're left with our relatively new dumping ground for homeless transforms: aggressive-instcombine. This only runs at -O3, but that seems like a reasonable limitation given that source code has many options to avoid this pattern (including the recently added clang intrinsics for rotates). I'm including a PhaseOrdering test because we require the teamwork of 3 passes (aggressive-instcombine, instcombine, simplifycfg) to get this into the minimal IR form that we want. That test shows a bug with the new pass manager that's independent of this change (but it will be masked if we canonicalize harder to funnel shift intrinsics in instcombine). Differential Revision: https://reviews.llvm.org/D55604 llvm-svn: 349396
*	[InstCombine] don't widen an arbitrary sequence of vector ops (PR40032)	Sanjay Patel	2018-12-17	2	-12/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The problem is shown specifically for a case with vector multiply here: https://bugs.llvm.org/show_bug.cgi?id=40032 ...and this might mask the original backend bug for ARM shown in: https://bugs.llvm.org/show_bug.cgi?id=39967 As the test diffs here show, we were (and probably still aren't) doing these kinds of transforms in a principled way. We are producing more or equal wide instructions than we started with in some cases, so we still need to restrict/correct other transforms from overstepping. If there are perf regressions from this change, we can either carve out exceptions to the general IR rules, or improve the backend to do these transforms when we know the transform is profitable. That's probably similar to a change like D55448. Differential Revision: https://reviews.llvm.org/D55744 llvm-svn: 349389
*	[EarlyCSE] If DI can't be salvaged, mark it as unavailable.	Davide Italiano	2018-12-17	1	-1/+2
\| \| \| \| \| \|	Fixes PR39874. llvm-svn: 349323
*	Add NetBSD support in needsRuntimeRegistrationOfSectionRange.	Kamil Rytarowski	2018-12-15	1	-0/+1
\| \| \| \| \| \|	Use linker script magic to get data/cnts/name start/end. llvm-svn: 349277
*	Register kASan shadow offset for NetBSD/amd64	Kamil Rytarowski	2018-12-15	1	-3/+7
\| \| \| \| \| \| \|	The NetBSD x86_64 kernel uses the 0xdfff900000000000 shadow offset. llvm-svn: 349276
*	[NewGVN] Update use counts for SSA copies when replacing them by their operands.	Florian Hahn	2018-12-15	1	-4/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current code relies on LeaderUseCount to determine if we can remove an SSA copy, but in that the LeaderUseCount does not refer to the SSA copy. If a SSA copy is a dominating leader, we use the operand as dominating leader instead. This means we removed a user of a ssa copy and we should decrement its use count, so we can remove the ssa copy once it becomes dead. Fixes PR38804. Reviewers: efriedma, davide Reviewed By: davide Differential Revision: https://reviews.llvm.org/D51595 llvm-svn: 349217
*	[Util] Refer to [s\|z]exts of args when converting dbg.declares (fix PR35400)	Vedant Kumar	2018-12-15	1	-27/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When converting dbg.declares, if the described value is a [s\|z]ext, refer to the ext directly instead of referring to its operand. This fixes a narrowing bug (the debugger got the sign of a variable wrong, see llvm.org/PR35400). The main reason to refer to the ext's operand was that an optimization may remove the ext itself, leading to a dropped variable. Now that InstCombine has been taught to use replaceAllDbgUsesWith (r336451), this is less of a concern. Other passes can/should adopt this API as needed to fix dropped variable bugs. Differential Revision: https://reviews.llvm.org/D51813 llvm-svn: 349214
*	[TransformWarning] Do not warn missed transformations in optnone functions.	Michael Kruse	2018-12-14	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Optimization transformations are intentionally disabled by the 'optnone' function attribute. Therefore do not warn if transformation metadata is still present. Using the legacy pass manager structure, the `skipFunction` method takes care for the optnone attribute (already called before this patch). For the new pass manager, there is no equivalent, so we check for the 'optnone' attribute manually. Differential Revision: https://reviews.llvm.org/D55690 llvm-svn: 349184
*	[Transforms] Preserve metadata when converting invoke to call.	Michael Kruse	2018-12-14	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The `changeToCall` function did not preserve the invoke's metadata. Currently, there is probably no metadata that depends on being applied on a CallInst or InvokeInst. Therefore we can replace the instruction's metadata. This fixes http://llvm.org/PR39994 Suggested-by: Moritz Kreutzer <moritz.kreutzer@siemens.com> Differential Revision: https://reviews.llvm.org/D55666 llvm-svn: 349170
*	Revert "[hwasan] Android: Switch from TLS_SLOT_TSAN(8) to TLS_SLOT_SANITIZER(6)"	Evgeniy Stepanov	2018-12-13	1	-3/+1
\| \| \| \| \| \| \| \|	Breaks sanitizer-android buildbot. This reverts commit af8443a984c3b491c9ca2996b8d126ea31e5ecbe. llvm-svn: 349092
*	[SampleFDO] handle ProfileSampleAccurate when initializing function entry count	Wei Mi	2018-12-13	1	-4/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	ProfileSampleAccurate is used to indicate the profile has exact match to the code to be optimized. Previously ProfileSampleAccurate is handled in ProfileSummaryInfo::isColdCallSite and ProfileSummaryInfo::isColdBlock. A better solution is to initialize function entry count to 0 when ProfileSampleAccurate is true, so we don't have to handle ProfileSampleAccurate in multiple places. Differential Revision: https://reviews.llvm.org/D55660 llvm-svn: 349088
*	Reapply "[MemCpyOpt] memset->memcpy forwarding with undef tail"	Nikita Popov	2018-12-13	1	-16/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently memcpyopt optimizes cases like memset(a, byte, N); memcpy(b, a, M); to memset(a, byte, N); memset(b, byte, M); if M <= N. Often this allows further simplifications down the line, which drop the first memset entirely. This patch extends this optimization for the case where M > N, but we know that the bytes a[N..M] are undef due to alloca/lifetime.start. This situation arises relatively often for Rust code, because Rust does not initialize trailing structure padding and loves to insert redundant memcpys. This also fixes https://bugs.llvm.org/show_bug.cgi?id=39844. The previous version of this patch did not perform dependency checking properly: While the dependency is checked at the position of the memset, the used size must be that of the memcpy. Previously the size of the memset was used, which missed modification in the region MemSetSize..CopySize, resulting in miscompiles. The added tests cover variations of this issue. Differential Revision: https://reviews.llvm.org/D55120 llvm-svn: 349078
*	[ThinLTO] Compute synthetic function entry count	Easwaran Raman	2018-12-13	2	-3/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch computes the synthetic function entry count on the whole program callgraph (based on module summary) and writes the entry counts to the summary. After function importing, this count gets attached to the IR as metadata. Since it adds a new field to the summary, this bumps up the version. Reviewers: tejohnson Subscribers: mehdi_amini, inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D43521 llvm-svn: 349076
*	[LoopUtils] Use i32 instead of `void`.	Davide Italiano	2018-12-13	1	-1/+1
\| \| \| \| \| \| \| \|	The actual type of the first argument of the @dbg intrinsic doesn't really matter as we're setting it to `undef`, but the bitcode reader is picky about `void` types. llvm-svn: 349069
*	[asan] Don't check ODR violations for particular types of globals	Vitaly Buka	2018-12-13	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: private and internal: should not trigger ODR at all. unnamed_addr: current ODR checking approach fail and rereport false violation if a linker merges such globals linkonce_odr, weak_odr: could cause similar problems and they are already not instrumented for ELF. Reviewers: eugenis, kcc Subscribers: kubamracek, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D55621 llvm-svn: 349015
*	Revert r348645 - "[MemCpyOpt] memset->memcpy forwarding with undef tail"	David L. Jones	2018-12-13	1	-30/+16
\| \| \| \| \| \| \|	This revision caused trucated memsets for structs with padding. See: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181210/610520.html llvm-svn: 349002
*	[LoopUtils] Prefer a set over a map. NFCI.	Davide Italiano	2018-12-13	1	-6/+4
\| \| \| \|	llvm-svn: 348999
*	[LoopDeletion] Update debug values after loop deletion.	Davide Italiano	2018-12-12	1	-0/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When loops are deleted, we don't keep track of variables modified inside the loops, so the DI will contain the wrong value for these. e.g. int b() { int i; for (i = 0; i < 2; i++) ; patatino(); return a; -> 6 patatino(); 7 return a; 8 } 9 int main() { b(); } (lldb) frame var i (int) i = 0 We mark instead these values as unavailable inserting a @llvm.dbg.value(undef to make sure we don't end up printing an incorrect value in the debugger. We could consider doing something fancier, for, e.g. constants, in the future. PR39868. rdar://problem/46418795) Differential Revision: https://reviews.llvm.org/D55299 llvm-svn: 348988
*	[InstCombine] Fix negative GEP offset evaluation for 32-bit pointers	Nikita Popov	2018-12-12	1	-5/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes https://bugs.llvm.org/show_bug.cgi?id=39908. The evaluateGEPOffsetExpression() function simplifies GEP offsets for use in comparisons against zero, basically by converting XScale+Offset==0 to X+Offset/Scale==0 if Scale divides Offset. However, before this is done, Offset is masked down to the pointer size. This results in incorrect results for negative Offsets, because we basically end up dividing the 32-bit offset zero* extended to 64-bit bits (rather than sign extended). Fix this by explicitly sign extending the truncated value. Differential Revision: https://reviews.llvm.org/D55449 llvm-svn: 348987
*	[hwasan] Android: Switch from TLS_SLOT_TSAN(8) to TLS_SLOT_SANITIZER(6)	Ryan Prichard	2018-12-12	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The change is needed to support ELF TLS in Android. See D55581 for the same change in compiler-rt. Reviewers: srhines, eugenis Reviewed By: eugenis Subscribers: srhines, llvm-commits Differential Revision: https://reviews.llvm.org/D55592 llvm-svn: 348983
*	[Unroll/UnrollAndJam/Vectorizer/Distribute] Add followup loop attributes.	Michael Kruse	2018-12-12	13	-83/+633
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When multiple loop transformation are defined in a loop's metadata, their order of execution is defined by the order of their respective passes in the pass pipeline. For instance, e.g. #pragma clang loop unroll_and_jam(enable) #pragma clang loop distribute(enable) is the same as #pragma clang loop distribute(enable) #pragma clang loop unroll_and_jam(enable) and will try to loop-distribute before Unroll-And-Jam because the LoopDistribute pass is scheduled after UnrollAndJam pass. UnrollAndJamPass only supports one inner loop, i.e. it will necessarily fail after loop distribution. It is not possible to specify another execution order. Also,t the order of passes in the pipeline is subject to change between versions of LLVM, optimization options and which pass manager is used. This patch adds 'followup' attributes to various loop transformation passes. These attributes define which attributes the resulting loop of a transformation should have. For instance, !0 = !{!0, !1, !2} !1 = !{!"llvm.loop.unroll_and_jam.enable"} !2 = !{!"llvm.loop.unroll_and_jam.followup_inner", !3} !3 = !{!"llvm.loop.distribute.enable"} defines a loop ID (!0) to be unrolled-and-jammed (!1) and then the attribute !3 to be added to the jammed inner loop, which contains the instruction to distribute the inner loop. Currently, in both pass managers, pass execution is in a fixed order and UnrollAndJamPass will not execute again after LoopDistribute. We hope to fix this in the future by allowing pass managers to run passes until a fixpoint is reached, use Polly to perform these transformations, or add a loop transformation pass which takes the order issue into account. For mandatory/forced transformations (e.g. by having been declared by #pragma omp simd), the user must be notified when a transformation could not be performed. It is not possible that the responsible pass emits such a warning because the transformation might be 'hidden' in a followup attribute when it is executed, or it is not present in the pipeline at all. For this reason, this patche introduces a WarnMissedTransformations pass, to warn about orphaned transformations. Since this changes the user-visible diagnostic message when a transformation is applied, two test cases in the clang repository need to be updated. To ensure that no other transformation is executed before the intended one, the attribute `llvm.loop.disable_nonforced` can be added which should disable transformation heuristics before the intended transformation is applied. E.g. it would be surprising if a loop is distributed before a #pragma unroll_and_jam is applied. With more supported code transformations (loop fusion, interchange, stripmining, offloading, etc.), transformations can be used as building blocks for more complex transformations (e.g. stripmining+stripmining+interchange -> tiling). Reviewed By: hfinkel, dmgreen Differential Revision: https://reviews.llvm.org/D49281 Differential Revision: https://reviews.llvm.org/D55288 llvm-svn: 348944