bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[MCJIT] Make RTDyldMemoryManager::getSymbolAddress's behaviour more consistent.	Lang Hames	2014-09-20	2	-15/+21
	This patch modifies RTDyldMemoryManager::getSymbolAddress(Name)'s behavior to make it consistent with how clients are using it: Name should be mangled, and getSymbolAddress should demangle it on the caller's behalf before looking the name up in the process. This patch also fixes the one client (MCJIT::getPointerToFunction) that had been passing unmangled names (by having it pass mangled names instead).
	Background: RTDyldMemoryManager::getSymbolAddress(Name) has always used a re-try mechanism when looking up symbol names in the current process. Prior to this patch getSymbolAddress first tried to look up 'Name' exactly as the user passed it in and then, if that failed, tried to demangle 'Name' and re-try the look up. The implication of this behavior is that getSymbolAddress expected to be called with unmangled names, and that handling mangled names was a fallback for convenience. This is inconsistent with how clients (particularly the RuntimeDyldImpl subclasses, but also MCJIT) usually use this API. Most clients pass in mangled names, and succeed only because of the fallback case.
	For clients passing in mangled names, getSymbolAddress's old behavior was actually dangerous, as it could cause unmangled names in the process to shadow mangled names being looked up. For example, consider:
	foo.c:
	  int _x = 7;
	  int x() { return _x; }
	foo.o:
	  000000000000000c D __x
	  0000000000000000 T _x
	If foo.c becomes part of the process (e.g. via dlopen("libfoo.dylib")) it will add symbols 'x' (the function) and '_x' (the variable) to the process. However, JIT clients looking for the function 'x' will be using the mangled function name '_x' (note how function 'x' appears in foo.o). When getSymbolAddress goes looking for '_x' it will find the variable instead, and return its address in place of the function, leading to JIT'd code calling the variable and crashing (if we're lucky).
	By requiring that getSymbolAddress be called with mangled names, and demangling only when we're about to do a lookup in the process, the new behavior implemented in this patch should eliminate any chance of names being shadowed during lookup.
	There's no good way to test this at the moment: This issue only arises when looking up process symbols (not JIT'd symbols). Any test case would have to generate a platform-appropriate dylib to pass to llvm-rtdyld, and I'm not aware of any in-tree tool for doing this in a portable way.
	llvm-svn: 218187
*	llvm-cov: Allow creating CoverageMappings from filenames	Justin Bogner	2014-09-20	1	-0/+14
	llvm-svn: 218185
*	llvm-cov: Disentangle the coverage data logic from the display (NFC)	Justin Bogner	2014-09-20	1	-0/+275
	This splits the logic for actually looking up coverage information from the logic that displays it. These were tangled rather thoroughly so this change is a bit large, but it mostly consists of moving things around. The coverage lookup logic itself now lives in the library, rather than being spread between the library and the tool. llvm-svn: 218184
*	llvm-cov: Move some reader debug output out of the tool.	Justin Bogner	2014-09-20	1	-0/+15
	This debug output is really for testing CoverageMappingReader, not the llvm-cov tool. Move it to where it can be more useful. llvm-svn: 218183
*	Using a deque to manage the stack of nodes is faster here.	Lenny Maiorani	2014-09-20	1	-2/+6
	Vector is slow due to many reallocations as the size regularly changes in unpredictable ways. See the investigation provided on the mailing list for more information: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120116/135228.html llvm-svn: 218182
*	MC: Treat ReadOnlyWithRel and ReadOnlyWithRelLocal as ReadOnly for COFF	David Majnemer	2014-09-20	2	-3/+3
	A problem with our old behavior becomes observable under x86-64 COFF when we need a read-only GV which has an initializer which is referenced using a relocation: we would mark the section as writable. Marking the section as writable interferes with section merging. This fixes PR21009. llvm-svn: 218179
*	[x86] Teach the v4f32 path of the new shuffle lowering to handle the	Chandler Carruth	2014-09-20	1	-0/+10
	tricky case of single-element insertion into the zero lane of a zero vector. We can't just use the same pattern here as we do in every other vector type because the general insertion logic can handle insertion into the non-zero lane of the vector. However, in SSE4.1 with v4f32 vectors we have INSERTPS that is a much better choice than the generic one for such lowerings. But INSERTPS can do lots of other lowerings as well so factoring its logic into the general insertion logic doesn't work very well. We also can't just extract the core common part of the general insertion logic that is faster (forming VZEXT_MOVL synthetic nodes that lower to MOVSS when they can) because VZEXT_MOVL is often faster than a blend while INSERTPS is slower! So instead we do a restrictive condition on attempting to use the generic insertion logic to narrow it to those cases where VZEXT_MOVL won't need a shuffle afterward and thus will do better than INSERTPS. Then we try blending. Then we go back to INSERTPS. This still doesn't generate perfect code for some silly reasons that can be fixed by tweaking the td files for lowering VZEXT_MOVL to use XORPS+BLENDPS when available rather than XORPS+MOVSS when the input ends up in a register rather than a load from memory -- BLENDPSrr has twice the reciprocal throughput of MOVSSrr. Don't you love this ISA? llvm-svn: 218177
*	[x86] Refactor the code for emitting INSERTPS to reuse the zeroable mask	Chandler Carruth	2014-09-20	1	-25/+15
	analysis used elsewhere. This removes the last duplicate of this logic. Also simplify the code here quite a bit. No functionality changed. llvm-svn: 218176
*	[x86] Generalize the single-element insertion lowering to work with	Chandler Carruth	2014-09-20	1	-13/+45
	floating point types and use it for both v2f64 and v2i64 single-element insertion lowering. This fixes the last non-AVX performance regression test case I've gotten for the new vector shuffle lowering. There is obvious analogous lowering for v4f32 that I'll add in a follow-up patch (because with INSERTPS, v4f32 requires special treatment). After that, it's AVX stuff. llvm-svn: 218175
*	[x86] Replace some duplicated logic reasoning about whether particular	Chandler Carruth	2014-09-20	1	-13/+6
	vector lanes can be modeled as zero with a call to the new function that computes a bit-vector representing that information. No functionality changed here, but will allow doing more clever things with the zero-test. llvm-svn: 218174
*	Fix crash with an insertvalue that produces an empty object.	Peter Collingbourne	2014-09-20	1	-0/+6
	llvm-svn: 218171
*	[X86] Erase some obsolete comments from README.txt	Robin Morisset	2014-09-19	1	-177/+0
	I just tried reproducing some of the optimization failures in README.txt in the X86 backend, and many of them could not be reproduced. In general the entire file appears quite bit-rotted; whatever interesting parts remain should be moved to bugzilla, and the rest deleted. I did not spend the time to do that, so I just deleted the few I tried reproducing which are obsolete, to save some time for whoever finds the courage to do it. llvm-svn: 218170
*	constify the TargetMachine being passed through the Mips subtarget	Eric Christopher	2014-09-19	8	-15/+18
	creation. llvm-svn: 218169
*	Converting InstrProf's error_category to a ManagedStatic to avoid static ↵	Chris Bieneman	2014-09-19	1	-2/+4
	constructors and destructors. llvm-svn: 218168
*	DIBuilder: Delete dead code, NFC	Duncan P. N. Exon Smith	2014-09-19	1	-28/+0
	There are two versions of `DIBuilder::createObjCIVar()`. Delete the one that's apparently dead. llvm-svn: 218167
*	Converting SpillPlacement's BlockFrequency threshold to a ManagedStatic to ↵	Chris Bieneman	2014-09-19	1	-3/+4
	avoid static constructors and destructors. llvm-svn: 218163
*	[FastIsel][AArch64] Fix a think-o in address computation.	Juergen Ributzka	2014-09-19	1	-20/+27
	When looking through sign/zero-extensions the code would always assume there is such an extension instruction and use the wrong operand for the address. There was also a minor issue in the handling of 'AND' instructions. I accidentally used a 'cast' instead of a 'dyn_cast'. llvm-svn: 218161
*	Converting object's error_category to a ManagedStatic to avoid static ↵	Chris Bieneman	2014-09-19	1	-2/+4
	constructors and destructors. llvm-svn: 218160
*	[x86] Hoist a function up to the rest of the non-type-specific lowering	Chandler Carruth	2014-09-19	1	-75/+74
	helpers, and re-flow the logic to use early exit and be a bit more readable. No functionality changed. llvm-svn: 218155
*	Converting the JITDebugLock mutex to a ManagedStatic to avoid the static ↵	Chris Bieneman	2014-09-19	1	-4/+4
	constructor and destructor. llvm-svn: 218154
*	[x86] Hoist the actual lowering logic into a helper function to separate	Chandler Carruth	2014-09-19	1	-74/+89
	it from the shuffle pattern matching logic. Also cleaned up variable names, comments, etc. No functionality changed. llvm-svn: 218152
*	Converting FuncNames to a ManagedStatic to avoid static constructors and ↵	Chris Bieneman	2014-09-19	1	-14/+14
	destructors. llvm-svn: 218151
*	R600/SI: Fix config value for number of gprs	Tom Stellard	2014-09-19	1	-4/+7
	In r217636, the value stored in KernelInfo.Num[VS]GPRSs was changed from the highest GPR index used to the number of GPRs in order to be consistent with the name of the variable. The code writing the config values still assumed that the value in this variable was the highest GPR index used, which caused the compiler to over-report the number of GPRs being used. https://bugs.freedesktop.org/show_bug.cgi?id=84089 llvm-svn: 218150
*	Eliminating static destructor for the BitCodeErrorCategory by converting to ↵	Chris Bieneman	2014-09-19	1	-2/+5
	a ManagedStatic.
	Summary: This is part of the overall goal of removing static initializers from LLVM.
	Reviewers: chandlerc
	Reviewed By: chandlerc
	Subscribers: chandlerc, llvm-commits
	Differential Revision: http://reviews.llvm.org/D5416
	llvm-svn: 218149
*	[x86] Fully generalize the zext lowering in the new vector shuffle	Chandler Carruth	2014-09-19	1	-33/+91
	lowering to support both anyext and zext and to custom lower for many different microarchitectures. Using this allows us to get exactly the right code for zext and anyext shuffles in all the vector sizes. For v16i8, the improvement is huge. I refused to add the new SSE2 test case before this because it was sooooo many instructions. llvm-svn: 218143
*	Add hsail and amdil64 to Triple	Matt Arsenault	2014-09-19	1	-5/+29
	llvm-svn: 218142
*	Omit DW_TAG_subprograms for subprograms without inlined subroutines when ↵	David Blaikie	2014-09-19	2	-24/+37
	producing -gmlt data
	To reduce the size of -gmlt data, skip the subprograms without any inlined subroutines. Since we've now got the ability to make these determinations in the backend (funnily enough - we added the flag so we wouldn't produce ranges under -gmlt, but with this change we use the flag, but go back to producing ranges under -gmlt). Instead, just produce CU ranges to inform the consumer which parts of the code are described by this CU's line table. Tools could inspect the line table directly to compute the range, but the CU ranges only seem to be about 0.5% of object/executable size, so I'm not too worried about teaching llvm-symbolizer that trick just yet - it's certainly a possible piece of future work.
	Update an llvm-symbolizer test just to demonstrate that this schema is acceptable there (if it wasn't, the compiler-rt tests would catch this, but good to have an in-llvm-tree test for llvm-symbolizer's behavior here).
	Building the clang binary with -gmlt with this patch reduces the total size of object files by 5.1% (5.56% without ranges) without compression and the executable by 4.37% (4.75% without ranges).
	llvm-svn: 218129
*	Change DwarfCompileUnit::createGlobalVariable to getOrCreateGlobalVariable.	Frederic Riss	2014-09-19	3	-12/+13
	Summary: This will allow requesting the creation of a forward-declared variable at its point of use (for imported declarations, this will be DwarfDebug::constructImportedEntityDIE) rather than having to put the forward decl in a retention list. Note that getOrCreateGlobalVariable returns the actual definition DIE when the routine creates a declaration and a definition DIE. If you agree this is the right behavior, then I'll have a followup patch that registers the definition in the DIE map instead of the declaration as it is today (this 'breaks' only one test, where we test that the imported entity is the declaration). I'm not sure what's best here, but it's easy enough for a consumer to follow the DW_AT_specification link to get to the declaration, whereas it takes more work to find the actual definition from a declaration DIE.
	Reviewers: echristo, dblaikie, aprantl
	Subscribers: llvm-commits
	Differential Revision: http://reviews.llvm.org/D5381
	llvm-svn: 218126
*	Turn local DWARFContext helpers getFileNameForUnit() and ↵	Frederic Riss	2014-09-19	3	-58/+54
	getFileLineInfoForCompileUnit() into full-blown DWARFDebugLine::LineTable methods.
	Summary: getFileNameForUnit() is basically a wrapper around LineTable::getFileNameByIndex(). Fold its additional functionality (adding the DWARFUnit compilation dir) into LineTable::getFileNameByIndex(). getFileLineInfoForCompileUnit() is a wrapper around getFileNameForUnit(). As a function to search the line information by address, it seems natural to put it in the LineTable also. Before this commit only the Context with its private helpers could do LineTable lookups. This newly exposed feature will be used by the DIE dumping code to get access to file information referenced in DIE attributes. This commit has already been partly reviewed in D5192 and contained an additional and somewhat controversial 'realpath' call that is left out of this patch. We can reinstate that realpath code later if it is desirable.
	Test Plan: The patch contains no tests as it should be functionally equivalent to the previous code. As requested in the last review, I checked if the relative path handling copied from the Context to LineTable::getFileNameByIndex() was covered, and indeed the symbolizer tests fail if it is removed.
	Reviewers: dblaikie, echristo, aprantl, samsonov
	Subscribers: llvm-commits
	Differential Revision: http://reviews.llvm.org/D5354
	llvm-svn: 218125
*	Elide unnecessary DenseMap copy.	Benjamin Kramer	2014-09-19	1	-3/+3
	No functionality change. llvm-svn: 218122
*	Optionally enable more-aggressive FMA formation in DAGCombine	Hal Finkel	2014-09-19	3	-5/+20
	The heuristic used by DAGCombine to form FMAs checks that the FMUL has only one use, but this is overly-conservative on some systems. Specifically, if the FMA and the FADD have the same latency (and the FMA does not compete for resources with the FMUL any more than the FADD does), there is no need for the restriction, and furthermore, forming the FMA leaving the FMUL can still allow for higher overall throughput and decreased critical-path length. Here we add a new TLI callback, enableAggressiveFMAFusion, false by default, to elide the hasOneUse check. This is enabled for PowerPC by default, as most PowerPC systems will benefit. Patch by Olivier Sallenave, thanks! llvm-svn: 218120
*	[x86] Recognize that we can use duplication to widen v16i8 shuffles due	Chandler Carruth	2014-09-19	1	-3/+3
	to undef lanes as well as defined widenable lanes. This dramatically improves the lowering we use for undef-shuffles in a zext-ish pattern for SSE2. llvm-svn: 218115
*	[x86] Teach the new vector shuffle lowering to also use pmovzx for v4i32	Chandler Carruth	2014-09-19	1	-1/+7
	shuffles that are zext-ing. Not a lot to see here; the undef lane variant is better handled with pshufd, but this improves the actual zext pattern. llvm-svn: 218112
*	[x86] Add a dedicated lowering path for zext-compatible vector shuffles	Chandler Carruth	2014-09-19	1	-0/+134
	to the new vector shuffle lowering code. This allows us to emit PMOVZX variants consistently for patterns where it is a viable lowering. This instruction is both fast and allows us to fold loads into it. This only hooks the new lowering up for i16 and i8 element widths, mostly so I could manage the change to the tests. I'll add the i32 one next, although it is significantly less interesting. One thing to note is that we already had some tests for these patterns but those tests had far less horrible instructions. The problem is that those tests weren't checking the strict start and end of the instruction sequence. =[ As a consequence something changed in the lowering making us generate TERRIBLE code for these patterns in SSE2 through SSSE3. I've consolidated all of the tests and spelled out the madness that we currently emit for these shuffles. I'm going to try to figure out what has gone wrong here. llvm-svn: 218102
*	Optimize sext/zext insertion algorithm in back-end.	Jiangning Liu	2014-09-19	3	-8/+61
	With this optimization, we will not always insert zext for values crossing basic blocks, but insert sext if the users of a value crossing basic blocks prefer a signed predicate. llvm-svn: 218101
*	Omit DW_AT_frame_base under -gmlt for size	David Blaikie	2014-09-19	1	-3/+7
	llvm-svn: 218100
*	Describe the -gmlt optimization committed in the previous revision.	David Blaikie	2014-09-19	1	-0/+1
	llvm-svn: 218099
*	Omit all the extra static attributes on subprograms in -gmlt	David Blaikie	2014-09-19	1	-0/+3
	This omission will be done in a fancier manner once we're dealing with "put gmlt in the skeleton CUs under fission" - it'll have to be conditional on the kind of CU we're emitting into (skeleton or gmlt). llvm-svn: 218098
*	Fix an it's vs. its typo.	Hans Wennborg	2014-09-19	1	-1/+1
	llvm-svn: 218093
*	R600: Better fix for bug 20982	Matt Arsenault	2014-09-19	1	-6/+3
	Just do the left shift as unsigned to avoid the UB. llvm-svn: 218092
*	Use cast<> instead of unchecked dyn_cast<>	Matt Arsenault	2014-09-18	1	-1/+1
	llvm-svn: 218085
*	LTO: introduce object file-based on-disk module format.	Peter Collingbourne	2014-09-18	4	-17/+90
	This format is simply a regular object file with the bitcode stored in a section named ".llvmbc", plus any number of other (non-allocated) sections. One immediate use case for this is to accommodate compilation processes which expect the object file to contain metadata in non-allocated sections, such as the ".go_export" section used by some Go compilers [1], although I imagine that in the future we could consider compiling parts of the module (such as large non-inlinable functions) directly into the object file to improve LTO efficiency. [1] http://golang.org/doc/install/gccgo#Imports Differential Revision: http://reviews.llvm.org/D4371 llvm-svn: 218078
*	[ARM] Do not perform a tail call when the caller returns several values.	Quentin Colombet	2014-09-18	1	-1/+11
	The fix is slightly different than x86 (see r216117) because the number of values attached to a return can vary even for a single returned value (e.g., f64 yields two returned values). <rdar://problem/18352998> llvm-svn: 218076
*	Restore "[ARM, Fix] Fix emitLeading/TrailingFence on old ARM processors"	Robin Morisset	2014-09-18	2	-4/+27
	Summary: This patch was originally in D5304 (I could not find a way to reopen that revision). It was accepted, committed and broke the build bots because the overloading of the constructor of ArrayRef for braced initializer lists is not supported by all toolchains. I then reverted it, and propose this fixed version that uses a plain C array instead in makeDMB (that array is then converted implicitly to an ArrayRef, but that is not behind an ifdef). Could someone confirm whether initialization lists for plain C arrays are supported by every toolchain used to build LLVM? Otherwise I can just initialize the array in the old way: args[0] = ...; .. ; args[5] = ...;
	Below is the description of the original patch:
	```
	I had only tested this code for ARMv7 and ARMv8. This patch adds several fallback paths if the processor does not support dmb ish:
	- dmb sy if a cortex-M with support for dmb
	- mcr p15, #0, r0, c7, c10, #5 for ARMv6 (special instruction equivalent to a DMB)
	These fallback paths were chosen based on the code for fence seq_cst. Thanks to luqmana for having noticed this bug.
	```
	Test Plan: Added more cases to atomic-load-store.ll + make check-all
	Reviewers: jfb, t.p.northover, luqmana
	Subscribers: llvm-commits, aemerson
	Differential Revision: http://reviews.llvm.org/D5386
	llvm-svn: 218066
*	Reverting NFC changes from r218050. Instead, the warning was disabled for ↵	Aaron Ballman	2014-09-18	12	-14/+0
	GCC in r218059, so these changes are no longer required. llvm-svn: 218062
*	[MCJIT] Fix a debugging-output formatting bug in RuntimeDyld.	Lang Hames	2014-09-18	1	-1/+1
	The mismatched mask (7 vs (ColsPerRow-1)) could lead to partial lines being printed out of place. llvm-svn: 218061
*	Revert part of r218041.	Frederic Riss	2014-09-18	1	-0/+3
	The patch moved some logic around in an attempt to generate potentially more DW_AT_declaration attributes. The patch was flawed though and it stopped generating the attribute in some cases. llvm-svn: 218060
*	R600: Bug 20982 - Avoid undefined left shift of negative value	Matt Arsenault	2014-09-18	1	-3/+10
	I'm not sure what the hardware actually does, so don't bother trying to fold it for now. llvm-svn: 218057
*	[SKX] Deriving rmb multiclasses from general one (avx512_icmp_packed_rmb and ↵	Robert Khasanov	2014-09-18	1	-26/+12
	avx512_icmp_cc_rmb). Thanks to Adam Nemet for pointing this out. llvm-svn: 218051
*	Fixing a bunch of -Woverloaded-virtual warnings due to hiding ↵	Aaron Ballman	2014-09-18	12	-0/+14
	getSubtargetImpl from the base class. NFC. llvm-svn: 218050
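
Note on r218187 ([MCJIT] Make RTDyldMemoryManager::getSymbolAddress's behaviour more consistent): the shadowing problem described in that message can be modelled with a toy symbol table. The sketch below is not the code from the patch. It assumes a platform where mangling prepends a single underscore, and `SymbolTable`, `lookupOld` and `lookupNew` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Toy stand-in for the process symbol table, keyed by unmangled name.
using SymbolTable = std::map<std::string, std::uint64_t>;

// Old behavior (as described in the commit message): try the name exactly
// as given, then fall back to the demangled form. Returns 0 when not found.
std::uint64_t lookupOld(const SymbolTable &process, const std::string &name) {
  assert(!name.empty());
  auto it = process.find(name);
  if (it != process.end())
    return it->second;  // an unmangled '_x' can shadow the mangled '_x' here
  if (name[0] == '_') {
    it = process.find(name.substr(1));
    if (it != process.end())
      return it->second;
  }
  return 0;
}

// New behavior: the caller always passes a mangled name, and the prefix is
// stripped before the single process lookup. Assumes '_' prefix mangling.
std::uint64_t lookupNew(const SymbolTable &process, const std::string &mangled) {
  assert(!mangled.empty());
  if (mangled[0] != '_')
    return 0;  // not a mangled name in this simplified model
  auto it = process.find(mangled.substr(1));
  return it == process.end() ? 0 : it->second;
}
```

With a table holding the function `x` and the variable `_x` from the foo.c example, `lookupOld(table, "_x")` returns the variable's address, while `lookupNew(table, "_x")` returns the function's.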
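
Note on r218182 (Using a deque to manage the stack of nodes is faster here): the idea of keeping a traversal stack in a `std::deque` instead of a `std::vector` can be sketched as follows. The `Node` type and `sumTree` function are hypothetical and unrelated to the file touched by that commit. Whether a deque is actually faster depends on the workload, as the linked mailing-list investigation discusses.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical tree node used only for this illustration.
struct Node {
  int value;
  std::vector<const Node *> children;
};

// Iterative depth-first traversal that sums all node values. The work stack
// is a std::deque, which grows in fixed-size chunks and never has to move
// existing elements when it expands.
int sumTree(const Node &root) {
  std::deque<const Node *> stack;
  stack.push_back(&root);
  int total = 0;
  while (!stack.empty()) {
    const Node *n = stack.back();
    stack.pop_back();
    assert(n != nullptr);
    total += n->value;
    for (const Node *child : n->children)
      stack.push_back(child);
  }
  return total;
}
```

Swapping the container type back to `std::vector<const Node *>` leaves the traversal unchanged, which makes the two easy to compare in a benchmark.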
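
Note on r218092 (R600: Better fix for bug 20982): "do the left shift as unsigned" refers to a general C++ idiom rather than anything R600-specific. A minimal sketch, with `shiftLeftNoUB` as a hypothetical helper name that does not appear in the patch:

```cpp
#include <cassert>
#include <cstdint>

// Shifts a signed 32-bit value left without invoking undefined behavior.
// Left-shifting a negative signed value is undefined before C++20, whereas
// unsigned shifts are always defined and wrap modulo 2^32.
inline std::int32_t shiftLeftNoUB(std::int32_t value, unsigned amount) {
  assert(amount < 32 && "shift amount must be smaller than the bit width");
  std::uint32_t bits = static_cast<std::uint32_t>(value) << amount;
  // Converting back is implementation-defined before C++20 for values above
  // INT32_MAX, but yields the two's-complement result on mainstream compilers.
  return static_cast<std::int32_t>(bits);
}
```

For example, `shiftLeftNoUB(-1, 4)` evaluates to -16 on a two's-complement target.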
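
Note on r218061 ([MCJIT] Fix a debugging-output formatting bug in RuntimeDyld): the message says a hard-coded mask of 7 did not match `ColsPerRow-1`. The sketch below shows the general pattern of deriving the row-break mask from the column count. It is an assumption about the shape of such a dump routine, not the RuntimeDyld code; `dumpBytes` and the value 16 for `ColsPerRow` are illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats bytes as hex, ColsPerRow per line, each line prefixed by its offset.
std::string dumpBytes(const std::uint8_t *data, std::size_t size) {
  constexpr std::size_t ColsPerRow = 16;
  static_assert((ColsPerRow & (ColsPerRow - 1)) == 0,
                "masking only works for power-of-two row widths");
  assert(data != nullptr || size == 0);
  std::string out;
  char buf[32];
  for (std::size_t i = 0; i < size; ++i) {
    // Break the row using a mask derived from ColsPerRow. A literal 7 here
    // would start a new row every 8 bytes and disagree with the row width.
    if ((i & (ColsPerRow - 1)) == 0) {
      if (i != 0)
        out += '\n';
      std::snprintf(buf, sizeof buf, "%04zx:", i);
      out += buf;
    }
    std::snprintf(buf, sizeof buf, " %02x", static_cast<unsigned>(data[i]));
    out += buf;
  }
  return out;
}
```

Deriving the mask from the constant keeps the two in step if the row width is ever changed.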