bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[CodeView] Refactor / Rewrite TypeSerializer and TypeTableBuilder.	Zachary Turner	2017-11-28	10	-462/+597
	The motivation behind this patch is that future directions require us to be able to compute the hash value of records independently of actually using them for de-duplication. The current structure of TypeSerializer / TypeTableBuilder, a single entry point that takes an unserialized type record and then hashes and de-duplicates it, is not flexible enough to allow this. At the same time, the existing TypeSerializer is already extremely complex for this very reason -- it tries to be too many things. In addition to serializing, hashing, and de-duplicating, it also supports splitting up field list records and adding continuations. All of this functionality crammed into one class makes it very complicated to work with and hard to maintain.

	To solve all of these problems, I've re-written everything from scratch and split the functionality into separate pieces that can easily be reused. The end result is that one class, TypeSerializer, is turned into 3 new classes, SimpleTypeSerializer, ContinuationRecordBuilder, and TypeTableBuilder, each of which in isolation is simple and straightforward.

	A quick summary of these new classes and their responsibilities:

	- SimpleTypeSerializer : Turns a non-FieldList leaf type into a series of bytes. Does not do any hashing. Every time you call it, it will re-serialize and return bytes again. The same instance can be re-used over and over to avoid re-allocations, and in exchange for this optimization the bytes returned by the serializer only live until the caller attempts to serialize a new record.
	- ContinuationRecordBuilder : Turns a FieldList-like record into a series of fragments. Does not do any hashing. Like SimpleTypeSerializer, returns references to privately owned bytes, so the storage is invalidated as soon as the caller tries to re-use the instance. Works equally well for LF_FIELDLIST as it does for LF_METHODLIST, solving a long-standing theoretical limitation of the previous implementation.
	- TypeTableBuilder : Accepts sequences of bytes that the user has already serialized, and inserts them by de-duplicating with a hash table. For the sake of convenience and efficiency, this class internally stores a SimpleTypeSerializer so that it can accept unserialized records. The same is not true of ContinuationRecordBuilder. The user is required to create their own instance of ContinuationRecordBuilder.

	Differential Revision: https://reviews.llvm.org/D40518 llvm-svn: 319198
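The de-duplication step described for TypeTableBuilder can be pictured with a small sketch. This is not the LLVM class: `ByteRecordTable` and `insertRecordBytes` are hypothetical names, records are modeled as `std::string` byte buffers, and the hashing is left to `std::unordered_map`. It only illustrates the idea that already-serialized bytes go in and a stable index per distinct record comes out.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a de-duplicating type table: it accepts
// already-serialized records and returns a stable index for each distinct
// byte sequence.
class ByteRecordTable {
public:
  // Returns the index of Record, inserting it only if it has not been seen.
  std::uint32_t insertRecordBytes(const std::string &Record) {
    auto It = Index.find(Record);
    if (It != Index.end())
      return It->second; // duplicate: reuse the existing index
    std::uint32_t NewIndex = static_cast<std::uint32_t>(Records.size());
    Records.push_back(Record);
    Index.emplace(Record, NewIndex);
    return NewIndex;
  }

  // Number of distinct records stored so far.
  std::size_t size() const { return Records.size(); }

private:
  std::vector<std::string> Records;                      // insertion order
  std::unordered_map<std::string, std::uint32_t> Index;  // bytes -> index
};
```

Because the table only ever sees bytes, the serializer and the table can be used independently, which is the separation the commit message argues for.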
*	[X86][X87] Tag FP_TO_INT_IN_MEM pseudos with hasNoSchedulingInfo	Simon Pilgrim	2017-11-28	1	-2/+2
	We don't need scheduling info for pseudos llvm-svn: 319197
*	[CodeGen] Separate MachineOperand implementation from MachineInstr	Francis Visoiu Mistrih	2017-11-28	3	-701/+752
	Move the implementation to its own file. Differential Revision: https://reviews.llvm.org/D40419 llvm-svn: 319194
*	[CodeGen] Cleanup MachineOperand	Francis Visoiu Mistrih	2017-11-28	1	-24/+0
	* clang-format
	* move doxygen from the implementation to headers
	* remove duplicate doxygen
	llvm-svn: 319193
*	AMDGPU: Add num spilled s/vgprs to metadata	Konstantin Zhuravlyov	2017-11-28	2	-0/+6
	This was requested by tools. Differential Revision: https://reviews.llvm.org/D40321 llvm-svn: 319192
*	[CodeGen] Print register names in lowercase in both MIR and debug output	Francis Visoiu Mistrih	2017-11-28	54	-268/+270
	As part of the unification of the debug format and the MIR format, always print registers as lowercase. * Only debug printing is affected. It now follows MIR. Differential Revision: https://reviews.llvm.org/D40417 llvm-svn: 319187
*	[WebAssembly] Support bitcasted function addresses with varargs.	Dan Gohman	2017-11-28	2	-9/+11
	Generalize FixFunctionBitcasts to handle varargs functions. This in particular fixes the case where clang bitcasts away a varargs when calling a K&R-style function. This avoids interacting with tricky ABI details because it operates at the LLVM IR level before varargs ABI details are exposed. This fixes PR35385. llvm-svn: 319186
*	DAG: Legalize truncstores to illegal int types	Matt Arsenault	2017-11-28	1	-6/+16
	Truncate to a legal int type, and produce a new truncstore from a narrower type. llvm-svn: 319185
*	[X86][X87] Tag FTST x87 instruction scheduler class	Simon Pilgrim	2017-11-28	1	-1/+2
	Looking through Agner, FTST is very similar to generic float compare behaviour, so I've added them to the existing IIC_FCOMI (WriteFAdd) tags. llvm-svn: 319184
*	[X86][X87] Tag FABS/FCHS/FSQRT/FSIN/FCOS x87 instruction scheduler classes	Simon Pilgrim	2017-11-28	3	-16/+30
	Atom's FABS/FCHS/FSQRT latencies taken from Agner. Note: I just added FSIN and FCOS to the existing IIC_FSINCOS itinerary, which is actually a more costly instruction. llvm-svn: 319175
*	Use getStoreSize() in various places instead of 'BitSize >> 3'.	Jonas Paulsson	2017-11-28	8	-38/+22
	This is needed for cases when the memory access is not as big as the width of the data type. For instance, storing i1 (1 bit) would be done in a byte (8 bits). Using 'BitSize >> 3' (or '/ 8') would e.g. give the memory access of an i1 a size of 0, which for instance makes alias analysis return NoAlias even when it shouldn't. There are no tests as this was done as a follow-up to the bugfix for the case where this was discovered (r318824). This handles more similar cases. Review: Björn Petterson https://reviews.llvm.org/D40339 llvm-svn: 319173
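The arithmetic behind this fix is easy to check in isolation. The sketch below uses two hypothetical helper names, `bytesByShift` and `bytesRoundedUp`; the second assumes that a store size is the bit width rounded up to whole bytes, which is the behavior the message attributes to getStoreSize().

```cpp
#include <cstdint>

// Truncating division: what 'BitSize >> 3' (or '/ 8') computes.
constexpr std::uint64_t bytesByShift(std::uint64_t BitSize) {
  return BitSize >> 3;
}

// Rounds up to whole bytes (assumed model of a store size).
constexpr std::uint64_t bytesRoundedUp(std::uint64_t BitSize) {
  return (BitSize + 7) / 8;
}

static_assert(bytesByShift(1) == 0, "an i1 appears to occupy zero bytes");
static_assert(bytesRoundedUp(1) == 1, "an i1 store touches one byte");
static_assert(bytesByShift(32) == bytesRoundedUp(32),
              "widths that are a multiple of 8 agree");
```

The two functions differ only for widths that are not a multiple of 8, which is exactly where a zero-sized access could mislead alias analysis.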
*	[Support] Merge toLower / toUpper implementations	Francis Visoiu Mistrih	2017-11-28	2	-27/+18
	Merge the ones from StringRef and StringExtras. llvm-svn: 319171
*	[CodeGen] Rename functions PrintReg* to printReg*	Francis Visoiu Mistrih	2017-11-28	66	-301/+301
	LLVM Coding Standards: Function names should be verb phrases (as they represent actions), and command-like function should be imperative. The name should be camel case, and start with a lower case letter (e.g. openFile() or isFoo()). Differential Revision: https://reviews.llvm.org/D40416 llvm-svn: 319168
*	[X86][3DNow] Add instruction itinerary and scheduling classes for femms/prefetch/prefetchw	Simon Pilgrim	2017-11-28	1	-6/+8
	llvm-svn: 319167
*	[ARM][AArch64] Workaround ARM/AArch64 peculiarity in clearing icache.	Peter Smith	2017-11-28	1	-4/+24
	Certain ARM implementations treat the icache clear instruction as a memory read, and the CPU segfaults on trying to clear the cache on a !PROT_READ page. We work around this in Memory::protectMappedMemory by adding PROT_READ to affected pages, clearing the cache, and then setting the desired protection. This fixes "AllocationTests/MappedMemoryTest.***/3" unit-tests on affected hardware. Reviewers: psmith, zatrazz, kristof.beyls, lhames Reviewed By: lhames Subscribers: llvm-commits, krytarowski, peter.smith, jgreenhalgh, aemerson, rengolin Patch by maxim-kuvrykov! Differential Revision: https://reviews.llvm.org/D40423 llvm-svn: 319166
*	Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable.	Chandler Carruth	2017-11-28	4	-0/+819
	The core idea is to (re-)introduce some redundancies where their cost is hidden by the cost of materializing immediates for constant operands of PHI nodes. When the cost of the redundancies is covered by this, avoiding materializing the immediate has numerous benefits:
	1) Less register pressure
	2) Potential for further folding / combining
	3) Potential for more efficient instructions due to immediate operand

	As a motivating example, consider the remarkably different cost on x86 of a SHL instruction with an immediate operand versus a register operand. This pattern turns up surprisingly frequently, but is somewhat rarely obvious as a significant performance problem.

	The pass is entirely target independent, but it does rely on the target cost model in TTI to decide when to speculate things around the PHI node. I've included x86-focused tests, but any target that sets up its immediate cost model should benefit from this pass.

	There is probably more that can be done in this space, but the pass as-is is enough to get some important performance on our internal benchmarks, and should be generally performance neutral, but help with more extensive benchmarking is always welcome.

	One awkward part is that this pass has to be scheduled after everything that can eliminate these kinds of redundancies. This includes SimplifyCFG, GVN, etc. I'm open to suggestions about better places to put this. We could in theory make it part of the codegen pass pipeline, but there doesn't really seem to be a good reason for that -- it isn't "lowering" in any sense and only relies on pretty standard cost model based TTI queries, so it seems to fit well with the "optimization" pipeline model. Still, further thoughts on the pipeline position are welcome.

	I've also only implemented this in the new pass manager. If folks are very interested, I can try to add it to the old PM as well, but I didn't really see much point (my use case is already switched over to the new PM).

	I've tested this pretty heavily without issue. A wide range of benchmarks internally show no change outside the noise, and I don't see any significant changes in SPEC either. However, the size class computation in tcmalloc is substantially improved by this, which turns into a 2% to 4% win on the hottest path through tcmalloc for us, so there are definitely important cases where this is going to make a substantial difference.

	Differential revision: https://reviews.llvm.org/D37467 llvm-svn: 319164
*	[TailRecursionElimination] Skip debug intrinsics.	Florian Hahn	2017-11-28	1	-1/+1
	Summary: I think we do not need to analyze debug intrinsics here, as they should not impact codegen. This has 2 benefits: 1) slightly less work to do and 2) avoiding generating optimization remarks for converting calls to debug intrinsics to tail calls, which are not really helpful for users. Based on work by Sander de Smalen. Reviewers: davide, trentxintong, aprantl Reviewed By: aprantl Subscribers: llvm-commits, JDevlieghere Tags: #debug-info Differential Revision: https://reviews.llvm.org/D40440 llvm-svn: 319158
*	AMDGPU: Re-organize the outer loop of SILoadStoreOptimizer	Nicolai Haehnle	2017-11-28	1	-6/+5
	Summary: The entire algorithm operates per basic-block, so for cache locality it should be better to re-optimize a basic-block immediately rather than in a separate loop. I don't have performance measurements. Change-Id: I85106570bd623c4ff277faaa50ee43258e1ddcc5 Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D40344 llvm-svn: 319156
*	AMDGPU: Consistently check for immediates in SIInstrInfo::FoldImmediate	Nicolai Haehnle	2017-11-28	1	-23/+22
	Summary: The PeepholeOptimizer pass calls this function solely based on checking DefMI->isMoveImmediate(), which only checks the MoveImm bit of the instruction description. So it's up to FoldImmediate itself to properly check that DefMI actually moves from an immediate. I don't have a separate test case for this, but the next patch introduces a test case which happens to crash without this change. This error is caught by the assertion in MachineOperand::getImm(). Change-Id: I88e7cdbcf54d75e1a296822e6fe5f9a5f095bbf8 Reviewers: arsenm, rampitec Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D40342 llvm-svn: 319155
*	[SCEV][NFC] More efficient caching in CompareValueComplexity	Max Kazantsev	2017-11-28	1	-4/+4
	Currently, we use a set of pairs to cache responses like `CompareValueComplexity(X, Y) == 0`. If we had proved that `CompareValueComplexity(S1, S2) == 0` and `CompareValueComplexity(S2, S3) == 0`, this cache does not allow us to prove that `CompareValueComplexity(S1, S3)` is also `0`. This patch replaces this set with `EquivalenceClasses` that merges Values into equivalence sets so that any two values from the same set are equal from the point of view of `CompareValueComplexity`. This, in particular, allows us to prove the fact from the example above. Differential Revision: https://reviews.llvm.org/D40429 llvm-svn: 319153
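The difference between a set of pairs and equivalence classes can be sketched with a tiny union-find. This is an illustration only, not LLVM's `EquivalenceClasses` implementation: `EqualityCache` is a hypothetical class, and plain `int` ids stand in for the values being compared.

```cpp
#include <unordered_map>

// Hypothetical cache of "compared equal" facts. Merging sets instead of
// storing pairs makes equality transitive for free.
class EqualityCache {
public:
  // Records that A and B compared equal.
  void unionSets(int A, int B) {
    int RootA = find(A);
    int RootB = find(B);
    if (RootA != RootB)
      Parent[RootA] = RootB;
  }

  // True if A and B are known equal, directly or transitively.
  bool isEquivalent(int A, int B) { return find(A) == find(B); }

private:
  // Returns the representative of X's set, creating a singleton if needed.
  int find(int X) {
    auto It = Parent.find(X);
    if (It == Parent.end()) {
      Parent.emplace(X, X);
      return X;
    }
    if (It->second == X)
      return X;
    int Root = find(It->second);
    Parent[X] = Root; // path compression
    return Root;
  }

  std::unordered_map<int, int> Parent;
};
```

After recording (S1, S2) and (S2, S3), a query for (S1, S3) succeeds, whereas a pair set would only answer for the two pairs that were explicitly stored.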
*	[COFF] Implement constructor priorities	Martin Storsjo	2017-11-28	1	-8/+29
	The priorities in the section name suffixes are zero padded, allowing the linker to just do a lexical sort. Add zero padding for .ctors sections in ELF as well. Differential Revision: https://reviews.llvm.org/D40407 llvm-svn: 319150
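Why zero padding matters for a lexical sort can be shown with a short sketch. The naming scheme here is an assumption made for illustration: `paddedSectionName` is a hypothetical helper, and the five-digit `.NNNNN` suffix may not match exactly how the patch derives its suffixes from priorities.

```cpp
#include <algorithm>
#include <cstdio>
#include <string>
#include <vector>

// Builds a hypothetical section name with a five-digit, zero-padded suffix.
std::string paddedSectionName(const std::string &Base, unsigned Priority) {
  char Suffix[16];
  std::snprintf(Suffix, sizeof(Suffix), ".%05u", Priority);
  return Base + Suffix;
}

// Sorts names as plain strings, the way a purely lexical sort would.
std::vector<std::string> lexicalOrder(std::vector<std::string> Names) {
  std::sort(Names.begin(), Names.end());
  return Names;
}
```

Without padding, ".ctors.1000" sorts before ".ctors.200" because '1' precedes '2'; with padding, ".ctors.00200" sorts before ".ctors.01000", so string order and numeric order agree.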
*	[SCEV][NFC] More efficient caching in CompareSCEVComplexity	Max Kazantsev	2017-11-28	1	-8/+9
	Currently, we use a set of pairs to cache responses like `CompareSCEVComplexity(X, Y) == 0`. If we had proved that `CompareSCEVComplexity(S1, S2) == 0` and `CompareSCEVComplexity(S2, S3) == 0`, this cache does not allow us to prove that `CompareSCEVComplexity(S1, S3)` is also `0`. This patch replaces this set with `EquivalenceClasses`, so that any two values from the same set are equal from the point of view of `CompareSCEVComplexity`. This, in particular, allows us to prove the fact from the example above. Differential Revision: https://reviews.llvm.org/D40428 llvm-svn: 319149
*	[GVN] Prevent ScalarPRE from hoisting across instructions that don't pass control flow to successors	Max Kazantsev	2017-11-28	1	-0/+14
	This is to address a problem similar to those in D37460 for Scalar PRE. We should not PRE across an instruction that may not pass execution to its successor unless it is safe to speculatively execute it. Differential Revision: https://reviews.llvm.org/D38619 llvm-svn: 319147
*	[WebAssembly] Handle errors better in fast-isel.	Dan Gohman	2017-11-28	1	-12/+40
	Fast-isel routines need to bail out in the case that fast-isel fails on the operands. This fixes https://bugs.llvm.org/show_bug.cgi?id=35064 llvm-svn: 319144
*	[X86] Remove some unused pattern fragments from td file. NFC	Craig Topper	2017-11-28	1	-10/+0
	llvm-svn: 319143
*	[DAGCombine] Disable finding better chains for stores at O0	Simon Dardis	2017-11-28	1	-1/+8
	Unoptimized IR can have linear sequences of stores to an array, where the initial GEP for the first store is formed from the pointer to the array, and the GEP for each store after the first is formed from the previous GEP with some offset in an inductive fashion. The (large) resulting DAG when analyzed by DAGCombine undergoes an excessive number of combines as each store node is examined every time its offset node is combined with any child of the offset. One of the transformations is findBetterNeighborChains which assists MergeConsecutiveStores. The former relies on repeated chain walking to do its work, however MergeConsecutiveStores is disabled at O0 which makes the transformation redundant. Any optimization level other than O0 would invoke InstCombine which would resolve the chain of GEPs into flat base + offset GEP for each store which does not exhibit the repeated examination of each store to the array. Disabling this optimization fixes an excessive compile time issue (~30 minutes for the test case provided) at O0. Reviewers: niravd, craig.topper, t.p.northover Differential Revision: https://reviews.llvm.org/D40193 llvm-svn: 319142
*	MachineVerifier: Improve register operand checks	Matthias Braun	2017-11-28	1	-78/+81
	This fixes cases where we wouldn't perform various register operand checks just because we didn't happen to have a definition in the MCInstrDesc. This changes the code to only skip the tests that actually depend on the MCInstrDesc definition. This makes the machine verifier spot the problem from https://llvm.org/PR33071 after the pass that actually caused it. llvm-svn: 319141
*	MachineVerifier: Improve PHI operand checking	Matthias Braun	2017-11-28	1	-28/+54
	Additional checks for phi operands:
	- first operand should be a virtual register def. It should not be tied, implicit, internalread, earlyclobber or a read.
	- The other operands should be register/mbb operands next to each other
	- The register operands should not be implicit, internalread, earlyclobber, debug or tied.
	- We can perform most of the PHI checks even for unreachable blocks.
	llvm-svn: 319140
*	Use FILE_FLAG_DELETE_ON_CLOSE for TempFile on windows.	Rafael Espindola	2017-11-28	2	-6/+80
	We won't see the temp file anymore. llvm-svn: 319137
*	[X86] Make zero extend from v16i1/v8i1 to v16i8/v8i16/v16i16 not scalarize under AVX512.	Craig Topper	2017-11-28	1	-0/+4
	llvm-svn: 319136
*	Move code. NFC.	Rafael Espindola	2017-11-28	1	-83/+85
	This moves the TempFile implementation so that it can use system specific code. llvm-svn: 319134
*	This reverts commit r319096 and r319097.	Rafael Espindola	2017-11-28	3	-165/+34
	Revert "[SROA] Propagate !range metadata when moving loads." Revert "[Mem2Reg] Clang-format unformatted parts of this file. NFCI." Davide says they broke a bot. llvm-svn: 319131
*	ARM: Fix PR32578	Matthias Braun	2017-11-28	1	-1/+1
	https://llvm.org/PR32578 I simplified and converted the reproducer into a lit test. Patch by Vedant Kumar! llvm-svn: 319130
*	[WebAssembly] Fix trapping behavior in fptosi/fptoui.	Dan Gohman	2017-11-28	8	-19/+227
	This adds code to protect WebAssembly's `trunc_s` family of opcodes from values outside their domain. Even though such conversions have full undefined behavior in C/C++, LLVM IR's `fptosi` and `fptoui` do not, and only return undef. This also implements the proposed non-trapping float-to-int conversion feature and uses that instead when available. llvm-svn: 319128
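The idea of guarding a conversion against out-of-domain inputs can be sketched at the source level. This is not the code the backend emits: `fitsInInt32` and `guardedFPToSI32` are hypothetical helpers, and the caller-supplied `Fallback` merely stands in for whatever value the lowering chooses when the input is out of range.

```cpp
#include <cstdint>

// True if truncating X toward zero yields a value representable as int32_t.
// NaN fails both comparisons, so it is rejected as well.
bool fitsInInt32(double X) {
  return X > -2147483649.0 && X < 2147483648.0;
}

// Converts X only when the result is in range; otherwise returns Fallback
// instead of performing a conversion that could trap or be undefined.
std::int32_t guardedFPToSI32(double X, std::int32_t Fallback) {
  return fitsInInt32(X) ? static_cast<std::int32_t>(X) : Fallback;
}
```

The bounds are open because truncation is toward zero: any value strictly between -2147483649 and 2147483648 truncates to something in the int32 range.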
*	SROA: Avoid creating a fragment expression that covers the entire variable.	Adrian Prantl	2017-11-28	1	-4/+9
	Fixes PR35416. https://bugs.llvm.org/show_bug.cgi?id=35416 llvm-svn: 319126
*	Move getVariableSize from Verifier.cpp into DIVariable::getSize() (NFC)	Adrian Prantl	2017-11-28	2	-26/+26
	llvm-svn: 319125
*	[X86] Remove unnecessary fp<->int setOperationAction lines from a hasVLX block. NFCI	Craig Topper	2017-11-28	1	-7/+0
	These lines all exist identically either under SSE2, AVX2 or AVX512. Given that VLX implies all of those, these aren't providing anything new. llvm-svn: 319124
*	[X86] Remove duplicate calls to setOperationAction. NFCI	Craig Topper	2017-11-28	1	-2/+0
	These same calls exist a few lines down. llvm-svn: 319122
*	Add an F_Delete flag.	Rafael Espindola	2017-11-28	1	-0/+2
	For now this only changes the handle Access. llvm-svn: 319121
*	[DAGCombiner] Don't combine aext(setcc) if the setcc is already using the target's preferred result type.	Craig Topper	2017-11-27	1	-8/+11
	With AVX512 vXi1 types are legal so we shouldn't be extending them. This change is similar to existing code in the zext(setcc) combine. llvm-svn: 319120
*	[DAGCombiner] Use EVT::changeVectorElementTypeToInteger() instead of implementing manually.	Craig Topper	2017-11-27	1	-4/+1
	llvm-svn: 319119
*	Add OpenFlags to the create(Unique|Temporary)File interfaces.	Rafael Espindola	2017-11-27	1	-14/+20
	This will allow a future F_Delete flag to be specified when we want the file to be automatically deleted on close. llvm-svn: 319117
*	[X86] Teach getSetCCResultType to handle more than just SimpleVTs when looking at larger than 512-bit vectors.	Craig Topper	2017-11-27	1	-15/+12
	Which VTs are considered simple is determined by the superset of the legal types of all targets in LLVM. If we're looking at VTs that are going to be split down to 512-bits we should allow any VT not just simple ones since the simple list changes over time as new targets are added. llvm-svn: 319110
*	Fixed the ability to recursively get an attribute value from a DWARFDie.	Greg Clayton	2017-11-27	1	-10/+9
	The previous implementation would only look 1 DW_AT_specification or DW_AT_abstract_origin deep. This means DWARFDie::getName() would fail in certain cases. I ran into such a case while creating a tool that used the LLVM DWARF parser to generate a symbolication format so I have seen this in the wild. Differential Revision: https://reviews.llvm.org/D40156 llvm-svn: 319104
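The difference between looking one reference deep and following the chain all the way can be shown on a toy model. `Entry` and `findNameRecursively` below are hypothetical and far simpler than DWARFDie: each entry optionally carries a name and may point at other entries through fields that model DW_AT_specification and DW_AT_abstract_origin. The sketch assumes the reference chains are acyclic.

```cpp
#include <optional>
#include <string>

// Hypothetical, heavily simplified stand-in for a debug-info entry.
struct Entry {
  std::optional<std::string> Name;        // the attribute being looked up
  const Entry *Specification = nullptr;   // models DW_AT_specification
  const Entry *AbstractOrigin = nullptr;  // models DW_AT_abstract_origin
};

// Follows reference chains of any depth until a name is found.
std::optional<std::string> findNameRecursively(const Entry *E) {
  if (!E)
    return std::nullopt;
  if (E->Name)
    return E->Name;
  if (auto FromSpec = findNameRecursively(E->Specification))
    return FromSpec;
  return findNameRecursively(E->AbstractOrigin);
}
```

With a chain such as inlined instance -> abstract origin -> declaration, a one-level lookup stops at the middle entry and finds nothing, while the recursive version reaches the declaration's name.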
*	[X86] Remove lines that set v8f32 FP_ROUND/FP_EXTEND to Legal under AVX512. NFCI	Craig Topper	2017-11-27	1	-2/+0
	We don't do this for narrow vectors under AVX or SSE features. We also don't set them to Expand like we do for many vector ops. Nor does TargetLoweringBase.cpp. This leads me to believe these default to Legal. llvm-svn: 319103
*	[Mem2Reg] Clang-format unformatted parts of this file. NFCI.	Davide Italiano	2017-11-27	1	-28/+23
	llvm-svn: 319097
*	[SROA] Propagate !range metadata when moving loads.	Davide Italiano	2017-11-27	3	-32/+168
	This tries to propagate !range metadata to a pre-existing load when a load is optimized out. This is done instead of adding an assume because converting loads to and from assumes creates a lot of IR. Patch by Ariel Ben-Yehuda. Differential Revision: https://reviews.llvm.org/D37216 llvm-svn: 319096
*	[PartiallyInlineLibCalls][x86] add TTI hook to allow sqrt inlining to depend on arg rather than result	Sanjay Patel	2017-11-27	4	-5/+19
	This should fix PR31455: https://bugs.llvm.org/show_bug.cgi?id=31455 Differential Revision: https://reviews.llvm.org/D28314 llvm-svn: 319094
*	[PowerPC] Remove redundant TOC saves	Zaara Syeda	2017-11-27	3	-2/+87
	This patch adds a peephole optimization to remove any redundant toc save instructions added as part of the call sequence for indirect calls. It removes any toc saves within a function that are dominated by another toc save. Differential Revision: https://reviews.llvm.org/D39736 llvm-svn: 319087
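The dominance rule in this message can be illustrated on a block-level model. This is a rough sketch rather than the PowerPC pass itself: `findRedundantSaves` is a hypothetical function, the dominator tree is given as an immediate-dominator array, and each block is assumed to contain at most one TOC save.

```cpp
#include <cstddef>
#include <vector>

// IDom[B] is the immediate dominator of block B; the entry block is its own
// immediate dominator. HasSave[B] says whether block B contains a TOC save.
// Returns, per block, whether its save is dominated by a save in another
// block and could therefore be removed.
std::vector<bool> findRedundantSaves(const std::vector<std::size_t> &IDom,
                                     const std::vector<bool> &HasSave) {
  std::vector<bool> Redundant(IDom.size(), false);
  for (std::size_t B = 0; B < IDom.size(); ++B) {
    if (!HasSave[B])
      continue;
    // Walk up the dominator tree looking for another save.
    for (std::size_t D = B; IDom[D] != D;) {
      D = IDom[D];
      if (HasSave[D]) {
        Redundant[B] = true;
        break;
      }
    }
  }
  return Redundant;
}
```

In a diamond-shaped CFG where block 0 dominates blocks 1, 2, and 3, a save in block 1 does not make a save in block 3 redundant, because block 1 does not dominate block 3; a save in block 0 makes every later save redundant.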
*	[SelectionDAG] Add a debug message when vector_shuffle nodes are created.	Craig Topper	2017-11-27	1	-1/+3
	We print a debug message when most nodes are created, but getVectorShuffle was missing. llvm-svn: 319085